
Migration Planning & Execution

Purpose: Shows operators how to plan and execute workload migration between infrastructure providers.

When Migration Is Needed

  • Changing infrastructure providers (e.g., VMware → OpenStack, on-prem → cloud)
  • Kubernetes version jump too large for in-place upgrade (e.g., 1.26 → 1.30)
  • Cluster architecture change (different node sizing, network topology, storage backend)
  • Consolidating multiple clusters into one

Planning Checklist

Before starting, document answers to these questions:

  1. Workload inventory — What namespaces, deployments, statefulsets, and CRDs exist on the source cluster?
  2. Data dependencies — Which workloads have PersistentVolumeClaims? What is the total data size?
  3. External integrations — DNS records, load balancer IPs, TLS certificates, external database connections.
  4. Downtime tolerance — Can workloads run on both clusters simultaneously (blue-green), or is a cutover window required?
  5. Rollback plan — If the migration fails, how do you revert to the source cluster?

# Generate a workload inventory from the source cluster
kubectl get deployments,statefulsets,daemonsets -A -o wide > workload-inventory.txt
kubectl get pvc -A -o wide > pvc-inventory.txt
kubectl get ingress,httproute -A > ingress-inventory.txt
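
The total-data-size answer for question 2 can be approximated from the PVC inventory. A minimal sketch (the sum_pvc_gib helper name is illustrative; it assumes capacities use Mi/Gi/Ti suffixes):

```shell
# Sum PVC capacities (one per line, e.g. "10Gi", "512Mi") into GiB.
# Feed it the capacity column, for example:
#   kubectl get pvc -A -o jsonpath='{range .items[*]}{.status.capacity.storage}{"\n"}{end}'
sum_pvc_gib() {
  awk '
    /Ti$/ { total += $1 * 1024; next }
    /Gi$/ { total += $1;        next }
    /Mi$/ { total += $1 / 1024; next }
    END   { printf "%.1f GiB\n", total }
  '
}
```

Usage: `printf '10Gi\n512Mi\n' | sum_pvc_gib` prints `10.5 GiB`.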

Step 1: Provision the Target Cluster

Use the openCenter CLI to initialize and provision the new cluster:

opencenter cluster init <new-cluster> --org <org-id> --type <provider>
opencenter cluster edit <new-cluster>
opencenter cluster setup <new-cluster>

# Provision infrastructure
cd infrastructure/clusters/<new-cluster>/
terraform apply

# Deploy Kubernetes
cd inventory/
ansible-playbook -i inventory.yaml -b --become-user=root cluster.yml

# Bootstrap FluxCD
opencenter cluster bootstrap <new-cluster>

Wait for all platform services to reconcile before proceeding.

Step 2: Migrate Platform Configuration

The target cluster gets its platform services from openCenter-gitops-base via FluxCD. Verify that all services match the source cluster:

# Compare service versions between clusters
flux get helmreleases -A --context=source-cluster
flux get helmreleases -A --context=target-cluster

Copy any cluster-specific overrides from the source overlay to the target overlay directory.
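
The comparison can be scripted rather than eyeballed. A sketch (release_drift is an illustrative name; it assumes flux's default column order of NAMESPACE, NAME, REVISION):

```shell
# Report HelmReleases whose presence or chart revision differs between clusters.
# Inputs: two sorted files of "namespace/name revision" lines, built with e.g.:
#   flux get helmreleases -A --context=<ctx> | tail -n +2 \
#     | awk '{print $1"/"$2, $3}' | sort > <file>
release_drift() {
  # comm -3 keeps only lines unique to one file: unindented lines exist only
  # in the first (source) file, tab-indented lines only in the second (target).
  comm -3 "$1" "$2"
}
```

An empty result means the two clusters run identical releases.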

Step 3: Migrate Workloads

Option A: Blue-Green (Minimal Downtime)

  1. Deploy application manifests to the target cluster (add GitRepository sources pointing to app repos).
  2. Run both clusters simultaneously.
  3. Migrate persistent data using Velero or application-level tools (see Data Portability).
  4. Switch DNS/load balancer to point to the target cluster.
  5. Verify traffic flows to the target.
  6. Decommission the source cluster.

Option B: Cutover Window

  1. Take a Velero backup of the source cluster.
  2. Scale down workloads on the source cluster.
  3. Restore the Velero backup on the target cluster.
  4. Update DNS/load balancer to point to the target.
  5. Verify and scale up on the target.

# Source cluster: backup everything
velero backup create full-migration --wait

# Target cluster: restore
velero restore create --from-backup full-migration --wait

Step 4: Update DNS and External References

After workloads are running on the target cluster:

# Update DNS records to point to new cluster ingress/load balancer IPs
# Update any external services that reference the old cluster API server endpoint
# Update CI/CD pipeline kubeconfig references
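
If the new ingress address isn't at hand, it can be read from the ingress controller's Service. A sketch assuming jq is installed (the Service name in the comment is illustrative; yours may differ):

```shell
# Extract the load balancer address from a Service JSON document.
# Pipe in, for example:
#   kubectl get svc ingress-nginx-controller -n ingress-nginx \
#     --context=target-cluster -o json
lb_address() {
  # Prefer the IP; some providers publish a hostname instead (e.g. AWS ELBs).
  jq -r '.status.loadBalancer.ingress[0] | (.ip // .hostname)'
}
```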

Step 5: Validate

# Verify all workloads are running on the target (Completed job pods are expected)
kubectl get pods -A --context=target-cluster | grep -Ev 'Running|Completed'

# Check ingress/routes are responding
curl -I https://app.example.com

# Run drift detection on the target
opencenter cluster drift <new-cluster>

# Verify Velero backups are running on the target
velero backup get --kubecontext=target-cluster

Step 6: Decommission the Source Cluster

After the target cluster is stable (recommended: run both for at least 48 hours):

# Remove FluxCD from the source to stop reconciliation
flux uninstall --context=source-cluster

# Destroy infrastructure
cd infrastructure/clusters/<old-cluster>/
terraform destroy

Troubleshooting

  • PVC restore fails on target — Storage class names may differ between providers. Create a StorageClass on the target that matches the source, or use Velero's storage class mapping (a ConfigMap labeled velero.io/change-storage-class in the velero namespace) to remap class names during restore.
  • DNS propagation delay — TTL on DNS records can cause traffic to hit the old cluster. Lower TTL before migration, or use weighted routing during cutover.
  • CRDs missing on target — If the source cluster has custom CRDs not in gitops-base, export and apply them to the target before restoring workloads.
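
For the storage-class case, Velero remaps classes at restore time via a plugin-config ConfigMap in its own namespace. A sketch as a config fragment (the class names in the data section are placeholders for your source and target StorageClass names):

```shell
# Velero's restore item action reads any ConfigMap in the velero namespace
# labeled velero.io/change-storage-class and rewrites PV/PVC classes on restore.
kubectl apply --context=target-cluster -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: change-storage-class-config
  namespace: velero
  labels:
    velero.io/plugin-config: ""
    velero.io/change-storage-class: RestoreItemAction
data:
  # <source StorageClass>: <target StorageClass>  (placeholder names)
  vsphere-standard: openstack-cinder
EOF
```

Apply this before running the restore so the mapping is picked up.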