
Day-2 Overview

Purpose: For operators, explains what day-2 operations mean for openCenter, including operational responsibilities and maintenance windows.

What Day-2 Means

Day-2 operations cover everything that happens after a cluster is provisioned and running. Day-1 gets you a working cluster; day-2 keeps it healthy, secure, and current. In openCenter, most day-2 work flows through Git — changes are proposed as PRs, reviewed, merged, and reconciled by FluxCD.

Operational Categories

Automated by FluxCD

These operations happen continuously without manual intervention once configured:

  • Service reconciliation — FluxCD watches the GitOps repository and reverts any drift in the cluster back to the desired state.
  • HelmRelease upgrades — Merging a version bump in a HelmRelease manifest triggers an automatic rollout.
  • Secret decryption — SOPS-encrypted secrets are decrypted on the fly during reconciliation.
  • Kyverno policy enforcement — Policy changes merged to the repository are applied within the reconciliation interval (default: 10 minutes).
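
Because reconciliation runs on an interval, a merged change can take a few minutes to reach the cluster. The sketch below wraps the standard flux reconcile command to trigger a sync immediately. It is an illustration, not a documented openCenter procedure: the function name reconcile_now is hypothetical, and the default Kustomization name and namespace (both flux-system) are assumptions to replace with the names used in your cluster.

```shell
# Hypothetical helper: trigger an immediate FluxCD sync instead of waiting
# for the next reconciliation interval.
# Arguments:
#   $1  Kustomization name (assumed default: flux-system)
#   $2  namespace          (assumed default: flux-system)
reconcile_now() {
  local name="${1:-flux-system}"
  local namespace="${2:-flux-system}"
  # --with-source refreshes the Git source first, then applies the Kustomization.
  flux reconcile kustomization "$name" --with-source -n "$namespace"
}
```

Calling reconcile_now with no arguments syncs the assumed default Kustomization; pass a name and namespace to target another one.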

Operator-Initiated via CLI

These require an operator to run a command or merge a PR:

  • Kubernetes version upgrades — Kubespray playbook re-run after updating version variables. See Kubernetes Upgrades.
  • Node replacement — Drain, remove, and re-provision through Kubespray inventory changes. See Node Replacement.
  • Backup and restore — Velero schedule configuration and restore operations. See Backup & Restore.
  • Key rotation — opencenter cluster rotate-keys for SOPS Age keys (90-day cycle) and SSH keys (180-day cycle).
  • Drift detection — opencenter cluster drift to compare live state against the Git repository.
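
The two rotation cycles above come down to simple date arithmetic. The following sketch shows that arithmetic in isolation. Both helper names are hypothetical and are not part of the openCenter CLI; it is assumed here that key creation times are available as Unix timestamps.

```shell
# Hypothetical helpers illustrating the rotation cycles described above.
# They are not part of the openCenter CLI.

# Print the whole number of days between two Unix timestamps.
key_age_days() {
  local created_epoch="$1"
  local now_epoch="$2"
  echo $(( (now_epoch - created_epoch) / 86400 ))
}

# Succeed (exit 0) when a key of the given age has reached its rotation cycle.
rotation_due() {
  local age_days="$1"
  local cycle_days="$2"
  [ "$age_days" -ge "$cycle_days" ]
}
```

For example, a key that is 90 days old is due under the 90-day SOPS Age cycle (rotation_due 90 90 succeeds) but not under the 180-day SSH cycle (rotation_due 90 180 fails).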

Manual / Infrastructure-Level

These touch the underlying infrastructure and typically require provider console access or Terraform changes:

  • Control plane resizing — VM CPU/memory changes with rolling restarts. See Resize Control Plane.
  • Disk expansion — Adding or resizing VM disks. See Add VM Disks.
  • Provider migration — Moving clusters between VMware, OpenStack, or other providers. See Migration Planning.

Maintenance Windows

openCenter does not enforce maintenance windows at the platform level. Operators define their own schedules. A typical approach:

  1. Kubernetes upgrades — Schedule during low-traffic periods. Kubespray performs rolling updates, but API server restarts cause brief control plane unavailability.
  2. Service upgrades — FluxCD rolls out changes as soon as they merge. To gate rollouts, use PR approval workflows and merge during maintenance windows.
  3. Node operations — Draining a node evicts workloads. Ensure sufficient capacity on remaining nodes before starting.
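
Since the platform does not enforce windows, a team may want a small guard around disruptive commands. The sketch below is one possible approach, not an openCenter feature: in_window and drain_in_window are hypothetical names, and the 22:00 to 04:00 UTC window is an example value. The kubectl cordon and kubectl drain calls use standard flags. The guard does not check remaining capacity, which still has to be verified separately.

```shell
# Hypothetical helper: succeed when the given hour (0-23) falls inside a
# window that starts at start_hour and ends before end_hour.
# Windows that wrap past midnight (for example 22 to 4) are supported.
in_window() {
  local hour="$1"
  local start_hour="$2"
  local end_hour="$3"
  if [ "$start_hour" -le "$end_hour" ]; then
    [ "$hour" -ge "$start_hour" ] && [ "$hour" -lt "$end_hour" ]
  else
    [ "$hour" -ge "$start_hour" ] || [ "$hour" -lt "$end_hour" ]
  fi
}

# Hypothetical wrapper: cordon and drain a node only inside the window.
# The 22-to-4 UTC window is an example value, not an openCenter default.
drain_in_window() {
  local node="$1"
  local hour
  hour="$(date -u +%H)"
  hour="${hour#0}"   # strip one leading zero (08 becomes 8)
  if ! in_window "$hour" 22 4; then
    echo "outside maintenance window; not draining $node" >&2
    return 1
  fi
  kubectl cordon "$node" &&
    kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
}
```

Outside the window the wrapper prints a message to stderr and returns non-zero without touching the node.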

Monitoring Operational Health

# Check FluxCD reconciliation status across all Kustomizations
flux get kustomizations -A

# Check HelmRelease status
flux get helmreleases -A

# View recent FluxCD events
kubectl get events -n flux-system --sort-by='.lastTimestamp'

# Check key expiration status
opencenter cluster check-keys <cluster-name>

# Run drift detection
opencenter cluster drift <cluster-name>
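
For scripted checks (for example before starting an upgrade), the commands above can be complemented by a single pass/fail gate. The sketch below uses kubectl wait against the Flux custom resources; flux_all_ready is a hypothetical name and the 60-second default timeout is an assumption. Note that kubectl wait reports an error when no resources of a kind exist, so a cluster without any HelmReleases would fail this check as written.

```shell
# Hypothetical helper: succeed only when every Flux Kustomization and
# HelmRelease in the cluster reports the Ready condition.
# $1 is the wait timeout (assumed default: 60s).
flux_all_ready() {
  local timeout="${1:-60s}"
  kubectl wait kustomizations.kustomize.toolkit.fluxcd.io --all -A \
    --for=condition=Ready --timeout="$timeout" &&
  kubectl wait helmreleases.helm.toolkit.fluxcd.io --all -A \
    --for=condition=Ready --timeout="$timeout"
}
```

The function returns non-zero as soon as either wait fails or times out, which makes it usable as a precondition in a runbook script.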

Responsibility Model

| Area | Owner | Mechanism |
| --- | --- | --- |
| Kubernetes version | Platform team | Kubespray re-run |
| Platform services | Platform team | FluxCD + gitops-base |
| Application deployments | App teams | FluxCD + app repos |
| Secrets rotation | Platform team | openCenter CLI |
| Infrastructure (VMs, networking) | Platform team | Terraform / provider console |
| Backup schedules | Platform team | Velero configuration |
| Policy updates | Platform team | Kyverno via gitops-base |

Further Reading