Day-2 Overview
Purpose: Explains to operators what day-2 operations mean for openCenter, including operational responsibilities and maintenance windows.
What Day-2 Means
Day-2 operations cover everything that happens after a cluster is provisioned and running. Day-1 gets you a working cluster; day-2 keeps it healthy, secure, and current. In openCenter, most day-2 work flows through Git — changes are proposed as PRs, reviewed, merged, and reconciled by FluxCD.
Operational Categories
Automated by FluxCD
These operations happen continuously without manual intervention once configured:
- Service reconciliation — FluxCD watches the GitOps repository and reverts any drift in the cluster back to the desired state.
- HelmRelease upgrades — Merging a version bump in a HelmRelease manifest triggers an automatic rollout.
- Secret decryption — SOPS-encrypted secrets are decrypted on the fly during reconciliation.
- Kyverno policy enforcement — Policy changes merged to the repository are applied within the reconciliation interval (default: 10 minutes).
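As a rough illustration of the HelmRelease upgrade path above, the following shell sketch bumps the chart version in a HelmRelease manifest, which is the kind of change that would then go into a PR. The file name, release name, chart name, repository name, and version numbers are all hypothetical. The manifest layout assumes the Flux `helm.toolkit.fluxcd.io/v2` API, and the `10m` interval simply mirrors the default quoted above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical manifest, written to a temp dir so the sketch is self-contained.
workdir="$(mktemp -d)"
manifest="$workdir/helmrelease.yaml"

cat > "$manifest" <<'EOF'
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: example-service
  namespace: example
spec:
  interval: 10m
  chart:
    spec:
      chart: example-service
      version: "1.2.3"
      sourceRef:
        kind: HelmRepository
        name: example-repo
EOF

# bump_chart_version FILE NEW_VERSION
# Rewrites every line whose first non-blank token is 'version:'.
# 'apiVersion:' is left alone because the pattern is anchored at line start.
bump_chart_version() {
  local file="$1" new="$2"
  sed -i.bak -E "s/^([[:space:]]*version:[[:space:]]*).*/\1\"$new\"/" "$file"
  rm -f "$file.bak"
}

bump_chart_version "$manifest" "1.3.0"

# Show the changed line; in a real repository this would be reviewed as a PR diff.
grep 'version:' "$manifest"
```

Nothing in this sketch talks to a cluster. The rollout itself only happens once the change is merged and FluxCD reconciles it.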
Operator-Initiated via CLI
These require an operator to run a command or merge a PR:
- Kubernetes version upgrades — Kubespray playbook re-run after updating version variables. See Kubernetes Upgrades.
- Node replacement — Drain, remove, and re-provision through Kubespray inventory changes. See Node Replacement.
- Backup and restore — Velero schedule configuration and restore operations. See Backup & Restore.
- Key rotation — `opencenter cluster rotate-keys` for SOPS Age keys (90-day cycle) and SSH keys (180-day cycle).
- Drift detection — `opencenter cluster drift` to compare live state against the Git repository.
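The rotation cycles above come down to simple date arithmetic. The sketch below is not how the openCenter CLI tracks key age (the source does not say); it is a hypothetical helper, `days_until_rotation`, that takes a last-rotation timestamp in epoch seconds and a cycle length in days, using the 90-day and 180-day figures from the list.

```shell
#!/usr/bin/env bash
set -euo pipefail

# days_until_rotation LAST_ROTATED_EPOCH CYCLE_DAYS [NOW_EPOCH]
# Prints the cycle length minus the whole days elapsed since the last rotation.
# A negative result means the rotation is overdue.
# (Hypothetical helper; not part of the openCenter CLI.)
days_until_rotation() {
  local last="$1" cycle="$2" now="${3:-$(date +%s)}"
  local age_days=$(( (now - last) / 86400 ))
  echo $(( cycle - age_days ))
}

# Cycle lengths taken from the list above.
SOPS_AGE_CYCLE_DAYS=90
SSH_CYCLE_DAYS=180

# Example with fixed, made-up timestamps: keys last rotated 100 days before "now".
now=1700000000
last=$(( now - 100 * 86400 ))

echo "SOPS Age key: $(days_until_rotation "$last" "$SOPS_AGE_CYCLE_DAYS" "$now") days left"
echo "SSH key: $(days_until_rotation "$last" "$SSH_CYCLE_DAYS" "$now") days left"
```

With these sample values the SOPS Age key is 10 days overdue while the SSH key still has 80 days left, which is why the two key types are tracked separately.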
Manual / Infrastructure-Level
These touch the underlying infrastructure and typically require provider console access or Terraform changes:
- Control plane resizing — VM CPU/memory changes with rolling restarts. See Resize Control Plane.
- Disk expansion — Adding or resizing VM disks. See Add VM Disks.
- Provider migration — Moving clusters between VMware, OpenStack, or other providers. See Migration Planning.
Maintenance Windows
openCenter does not enforce maintenance windows at the platform level. Operators define their own schedules. A typical approach:
- Kubernetes upgrades — Schedule during low-traffic periods. Kubespray performs rolling updates, but API server restarts cause brief control plane unavailability.
- Service upgrades — FluxCD rolls out changes as soon as they merge. To gate rollouts, use PR approval workflows and merge during maintenance windows.
- Node operations — Draining a node evicts workloads. Ensure sufficient capacity on remaining nodes before starting.
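Because the platform does not enforce maintenance windows, any gating has to live in the operator's own tooling. One possible shape for such a gate is sketched below: a hypothetical `in_maintenance_window` function that checks an hour of the day against a window, including windows that wrap past midnight. The 22:00 to 04:00 UTC window is an invented example, not an openCenter default.

```shell
#!/usr/bin/env bash
set -euo pipefail

# in_maintenance_window HOUR START END
# Succeeds (exit 0) when HOUR (0-23) falls in the half-open range [START, END).
# If START > END the window is treated as wrapping past midnight.
# (Hypothetical helper; openCenter itself does not enforce windows.)
in_maintenance_window() {
  local hour="$1" start="$2" end="$3"
  if (( start <= end )); then
    (( hour >= start && hour < end ))
  else
    (( hour >= start || hour < end ))
  fi
}

# Invented example window: 22:00 to 04:00 UTC.
WINDOW_START=22
WINDOW_END=4

# Force base 10 so hours such as "08" are not parsed as octal.
current_hour=$(( 10#$(date -u +%H) ))

if in_maintenance_window "$current_hour" "$WINDOW_START" "$WINDOW_END"; then
  echo "Inside the maintenance window: OK to merge or start the upgrade."
else
  echo "Outside the maintenance window: hold the merge."
fi
```

A check like this could run as a required PR status or as a pre-flight step before a Kubespray run, so that the merge itself remains the point where the schedule is enforced.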
Monitoring Operational Health
```shell
# Check FluxCD reconciliation status for Kustomizations in all namespaces
flux get kustomizations -A

# Check HelmRelease status in all namespaces
flux get helmreleases -A

# View recent FluxCD events, oldest first
kubectl get events -n flux-system --sort-by='.lastTimestamp'

# Check key expiration status
opencenter cluster check-keys <cluster-name>

# Run drift detection
opencenter cluster drift <cluster-name>
```
Responsibility Model
| Area | Owner | Mechanism |
|---|---|---|
| Kubernetes version | Platform team | Kubespray re-run |
| Platform services | Platform team | FluxCD + gitops-base |
| Application deployments | App teams | FluxCD + app repos |
| Secrets rotation | Platform team | openCenter CLI |
| Infrastructure (VMs, networking) | Platform team | Terraform / provider console |
| Backup schedules | Platform team | Velero configuration |
| Policy updates | Platform team | Kyverno via gitops-base |
Further Reading
- PR Workflows — How changes flow through Git review before reaching clusters.
- Drift Detection — Detecting and resolving configuration drift.
- Service Upgrades — Rolling out new service versions via FluxCD.