Skip to main content

Production Installation

Purpose: For platform engineers, covers production-grade installation with HA control plane, load balancer, private registry, TLS certificates, and backup configuration.

Pre-flight Checklist

ItemRequirementVerify
Control plane nodes3 VMs (minimum)Odd number for etcd quorum
Worker nodes3+ VMsSized for workload
Load balancerVIP or cloud LB for API serverReachable from all nodes
Private registryHarbor or equivalentDNS resolvable, TLS configured
DNSWildcard record for cluster domain*.cluster.example.com
NTPSynchronized clocks<1s skew
SSH accessKey-based, passwordless sudoFrom bastion to all nodes
openCenter CLIInstalled on operator workstationopencenter version

Sizing Recommendations

RoleCPUMemoryDiskCount
Control plane4 vCPU8 GiB100 GiB SSD3
Worker (general)8 vCPU32 GiB200 GiB SSD3–10
Worker (monitoring)4 vCPU16 GiB500 GiB SSD1–2
Bastion2 vCPU4 GiB50 GiB1
Load balancer2 vCPU4 GiB20 GiB2 (HA pair)

For clusters running the full observability stack (Prometheus, Loki, Tempo), add dedicated monitoring workers with larger disks for time-series and log storage.

Step 1 — Initialize Cluster Configuration

opencenter cluster init prod-cluster --org mycompany --type openstack
opencenter cluster use mycompany/prod-cluster

Step 2 — Configure HA Control Plane (3 Nodes)

# In the cluster configuration
opencenter:
cluster:
cluster_name: prod-cluster
kubernetes:
version: "1.33.5"
api_port: 6443
infrastructure:
compute:
master_count: 3
worker_count: 5
flavor_master: m1.xlarge
flavor_worker: m1.2xlarge

For VMware or bare metal, define explicit nodes:

cloud:
vmware:
nodes:
- name: prod-cp-1
role: master
ip: 10.0.1.10
- name: prod-cp-2
role: master
ip: 10.0.1.11
- name: prod-cp-3
role: master
ip: 10.0.1.12
- name: prod-wk-1
role: worker
ip: 10.0.1.20
# ...

Step 3 — Load Balancer Configuration

Option A: VRRP (kube-vip / keepalived)

opencenter:
cluster:
networking:
vrrp_enabled: true
vrrp_ip: "10.0.1.5" # Virtual IP for API server
loadbalancer_provider: ovn

Option B: External Load Balancer (OpenStack Octavia)

opencenter:
cluster:
networking:
use_octavia: true
loadbalancer_provider: octavia

Option C: MetalLB (Bare Metal / VMware)

opencenter:
services:
metallb:
enabled: true
address_pool_start: "10.0.1.200"
address_pool_end: "10.0.1.220"

Step 4 — Private Registry (Harbor)

opencenter:
services:
harbor:
enabled: true
hostname: "registry.prod-cluster.example.com"
storage_type: s3
s3_bucket: prod-harbor-registry
s3_region: us-east-1
registry_volume_size: 500

Configure containerd mirrors on all nodes to pull through Harbor:

opencenter:
cluster:
kubernetes:
containerd_mirrors:
"docker.io":
endpoint: ["https://registry.prod-cluster.example.com/v2/dockerhub"]
"ghcr.io":
endpoint: ["https://registry.prod-cluster.example.com/v2/ghcr"]

Step 5 — TLS Certificates

cert-manager is enabled by default. Configure production issuers:

opencenter:
services:
cert-manager:
enabled: true
email: "platform-team@example.com"
letsencrypt_server: "https://acme-v02.api.letsencrypt.org/directory"

For private CA (air-gap or internal):

opencenter:
services:
cert-manager:
enabled: true
create_cluster_issuer: true
# Use internal CA instead of Let's Encrypt

Step 6 — Backup Configuration (Velero)

opencenter:
services:
velero:
enabled: true
backup_bucket: prod-cluster-backups
region: us-east-1
storage_type: s3

After deployment, create a backup schedule:

velero schedule create daily-backup \
--schedule="0 2 * * *" \
--ttl 720h \
--include-namespaces '*' \
--exclude-namespaces kube-system

Step 7 — Validate and Deploy

# Validate configuration
opencenter cluster validate

# Generate GitOps repository
opencenter cluster generate prod-cluster

# Initialize Git and push
cd ~/prod-cluster-gitops
git init && git add . && git commit -m "Initial production config"
git remote add origin <repo-url>
git push -u origin main

# Deploy
opencenter cluster deploy prod-cluster

Verification

# All control plane nodes Ready
kubectl get nodes -l node-role.kubernetes.io/control-plane

# etcd cluster healthy
kubectl -n kube-system exec etcd-prod-cp-1 -- etcdctl endpoint health --cluster

# All platform services running
kubectl get kustomizations -n flux-system
kubectl get helmreleases -A

# Certificates valid
kubectl get certificates -A

# Backup schedule active
velero schedule get

Post-Deployment

  • Verify load balancer failover (stop one CP node, confirm API remains accessible)
  • Test backup/restore on a non-production namespace
  • Configure alerting rules in Grafana
  • Run kube-bench for CIS compliance validation
  • Document the cluster in your CMDB