First VMware Cluster
Purpose: Walks platform engineers through deploying a first production-ready Kubernetes cluster on VMware vSphere using openCenter (about 15 minutes of configuration plus 40–60 minutes of deployment).
What You'll Do
- Initialize a VMware cluster configuration with the CLI
- Validate, generate the GitOps repository, and deploy
- Verify nodes, vSphere CSI, and platform services are running
End result: A 6-node HA Kubernetes cluster (3 control plane nodes, 3 workers) on vSphere with vSphere CSI storage, Kyverno policies, monitoring, and GitOps, ready for application workloads (roughly 55–75 minutes in total, including configuration).
Prerequisites
- openCenter CLI installed (CLI Installation)
- VMware vSphere 7.0+ with admin credentials
- vCenter access with permissions to create VMs, networks, and datastores
- A VM template (Ubuntu 22.04 recommended) uploaded to vSphere
- DNS records or a wildcard domain for ingress
- SSH key pair for node access (or let the CLI generate one)
- A Git repository for GitOps (GitHub or GitLab)
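The tooling prerequisites above can be checked before starting. The following is a minimal sketch, assuming that the CLIs used later in this guide (opencenter, kubectl, flux) plus git and ssh-keygen should all resolve on PATH. The helper name missing_tools is hypothetical and is not part of openCenter.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        # `command -v` is the POSIX way to test whether a command resolves.
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example (prints nothing when everything is installed):
# missing_tools opencenter kubectl flux git ssh-keygen
```

Empty output means every listed tool was found; any name printed should be installed before Step 1.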
Step 1: Initialize the Cluster Configuration
opencenter cluster init my-vmware-cluster --org my-org --type vmware
This creates the configuration at ~/.config/opencenter/clusters/my-org/.my-vmware-cluster-config.yaml and auto-generates SOPS Age keys and an SSH key pair.
Edit the configuration:
opencenter cluster edit my-vmware-cluster
Key sections to configure:
opencenter:
  cluster:
    cluster_name: my-vmware-cluster
    organization: my-org
  infrastructure:
    provider: vmware
    cloud:
      vmware:
        vcenter_host: vcenter.example.com
        username: administrator@vsphere.local
        password: "${VSPHERE_PASSWORD}"
        datacenter: DC1
        cluster: Cluster01
        datastore: datastore1
        network: VM Network
        template: ubuntu-22.04-template
        resource_pool: openCenter
  kubernetes:
    version: 1.33.5
    control_plane_count: 3
    worker_count: 3
    cni: calico
  services:
    keycloak:
      enabled: true
    kube-prometheus-stack:
      enabled: true
    loki:
      enabled: true
    velero:
      enabled: true
    vsphere-csi:
      enabled: true
secrets:
  sops:
    age_keys:
      - age1... # Auto-generated during init
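The password field above uses a ${VSPHERE_PASSWORD} placeholder. This guide does not spell out how the placeholder is resolved; assuming it is read from the environment, a guard like the following sketch can fail early when the variable is missing. The function require_env is hypothetical.

```shell
#!/bin/sh
# Hypothetical guard: return non-zero if the named environment variable
# is unset or empty. Assumes the CLI reads VSPHERE_PASSWORD from the environment.
require_env() {
    name=$1
    # Accept only plain variable names so the eval below stays safe.
    case $name in
        ''|[0-9]*|*[!A-Za-z0-9_]*)
            printf 'error: invalid variable name: %s\n' "$name" >&2
            return 2
            ;;
    esac
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        printf 'error: %s is not set\n' "$name" >&2
        return 1
    fi
}

# Example:
# require_env VSPHERE_PASSWORD && opencenter cluster validate my-vmware-cluster
```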
Step 2: Validate Configuration
opencenter cluster validate my-vmware-cluster
Validation checks: schema compliance, provider-specific rules (vCenter connectivity, datastore existence, template availability), and business rules (node count ≥ 3 for HA).
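The HA rule can be approximated locally before running the validator. This is only a sketch under stated assumptions: check_control_plane_count is a hypothetical helper, the "at least 3" threshold is taken from the text above and applied to control plane nodes, and the odd-count test reflects general etcd quorum guidance (a cluster of N members tolerates floor((N-1)/2) failures) rather than a rule openCenter is confirmed to enforce.

```shell
#!/bin/sh
# Hypothetical local check; the real validation is done by
# `opencenter cluster validate`.
check_control_plane_count() {
    cp=$1
    if [ "$cp" -lt 3 ]; then
        echo "control_plane_count=$cp: fewer than 3 nodes, not HA"
        return 1
    fi
    # An even member count tolerates no more failures than the odd count below it.
    if [ $((cp % 2)) -eq 0 ]; then
        echo "control_plane_count=$cp: even counts add no etcd fault tolerance"
        return 1
    fi
    echo "ok"
}

# Example:
# check_control_plane_count 3
```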
Step 3: Generate GitOps Repository
opencenter cluster generate my-vmware-cluster
This generates:
- Terraform/OpenTofu infrastructure code for VM provisioning
- Kubespray inventory with security hardening (k8s_hardening.yml)
- FluxCD application manifests (GitRepository sources referencing openCenter-gitops-base)
- SOPS-encrypted secrets
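Before pushing the generated repository, it can be worth confirming that secret files really are encrypted. The sketch below assumes YAML secrets: SOPS-encrypted YAML carries a top-level sops: metadata key and ENC[...] values. The helper is_sops_encrypted is hypothetical, and the layout of the generated repository is not described here, so pass whichever file paths apply.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the file looks like SOPS-encrypted YAML.
is_sops_encrypted() {
    # SOPS adds a top-level `sops:` mapping and wraps values as ENC[...].
    grep -q '^sops:' "$1" && grep -q 'ENC\[' "$1"
}

# Example (the path is a placeholder):
# is_sops_encrypted path/to/secret.yaml || echo "NOT encrypted"
```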
Step 4: Deploy the Cluster
opencenter cluster deploy my-vmware-cluster
The deploy command:
- Provisions VMs in vSphere via Terraform (10–15 minutes)
- Installs Kubernetes via Kubespray with containerd, etcd HA, and security hardening (20–30 minutes)
- Bootstraps FluxCD which reconciles platform services from openCenter-gitops-base (10–15 minutes)
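Steps 2 through 4 can be chained so that a failed validation or generation never reaches deployment. This sketch uses only the three subcommands shown in this guide; the wrapper name deploy_cluster is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run validate, generate, and deploy in order,
# stopping at the first step that fails.
deploy_cluster() {
    cluster=$1
    for step in validate generate deploy; do
        echo "==> opencenter cluster $step $cluster"
        opencenter cluster "$step" "$cluster" || {
            echo "step '$step' failed; stopping" >&2
            return 1
        }
    done
}

# Example:
# deploy_cluster my-vmware-cluster
```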
Step 5: Verify the Cluster
# Check cluster status
opencenter cluster status my-vmware-cluster
# Verify nodes
kubectl get nodes
# Check vSphere CSI driver
kubectl get pods -n vmware-system-csi
# Confirm FluxCD reconciliation
flux get kustomizations
# Check storage classes
kubectl get sc
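On a 6-node cluster it is easy to miss one node that is still joining. As a rough sketch, the filter below reads the output of kubectl get nodes --no-headers and prints any node whose STATUS column is not exactly Ready (cordoned nodes, shown as Ready,SchedulingDisabled, are reported too). The function name not_ready_nodes is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter. Input columns: NAME STATUS ROLES AGE VERSION
not_ready_nodes() {
    awk '$2 != "Ready" { print $1 }'
}

# Example (empty output means every node is Ready):
# kubectl get nodes --no-headers | not_ready_nodes
```

kubectl wait --for=condition=Ready nodes --all --timeout=10m is an alternative that blocks until the nodes report Ready or the timeout expires.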
Check Your Work
- All nodes show Ready status
- vSphere CSI driver pods are running in the vmware-system-csi namespace
- FluxCD kustomizations show Ready=True
- cert-manager is issuing certificates (kubectl get cert -A)
- kubectl get sc shows the vSphere storage class as default
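The storage class check can be scripted as well. kubectl marks the default class by printing (default) after its name, so a small filter over kubectl get sc --no-headers is enough for a quick look. This is a sketch: default_storage_classes is a hypothetical helper, and the class name vsphere-csi in the sample comment is a placeholder, not a name confirmed by this guide.

```shell
#!/bin/sh
# Hypothetical filter: print the name of every storage class flagged as default.
# Sample input line: vsphere-csi (default) csi.vsphere.vmware.com Delete Immediate false 5m
default_storage_classes() {
    awk '$2 == "(default)" { print $1 }'
}

# Example (expect exactly one line of output):
# kubectl get sc --no-headers | default_storage_classes
```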
Platform Services Deployed
After FluxCD reconciles, these services from openCenter-gitops-base are running:
| Service | Version | Namespace |
|---|---|---|
| cert-manager | v1.18.2 | cert-manager |
| Gateway API (Envoy) | latest | envoy-gateway-system |
| Keycloak | 26.4.2 | keycloak |
| Kyverno | 3.6.0 | kyverno |
| kube-prometheus-stack | 77.6.0 | observability |
| Loki | 6.45.2 | observability |
| Velero | 10.1.1 | velero |
| vSphere CSI | 3.8.1 | vmware-system-csi |
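To spot a service from the table that has not come up, one option is to list pods that are neither Running nor Completed across all namespaces. The following is a minimal sketch: unhealthy_pods is a hypothetical helper that parses the column layout of kubectl get pods -A --no-headers.

```shell
#!/bin/sh
# Hypothetical filter. Input columns: NAMESPACE NAME READY STATUS RESTARTS AGE
# Prints "namespace/pod status" for pods that are not Running or Completed.
unhealthy_pods() {
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Example (empty output means nothing is pending or crashing):
# kubectl get pods -A --no-headers | unhealthy_pods
```

A Running status does not guarantee that every container in a pod is ready, so treat empty output as a first pass and compare the READY column for anything that still misbehaves.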
Troubleshooting
| Symptom | Likely Cause | Fix |
|---|---|---|
| VMs fail to create | Insufficient datastore space or wrong template | Verify datastore and template names match vCenter exactly |
| Nodes not joining | Firewall blocking port 6443 | Verify VM Network allows inter-node traffic |
| CSI volumes not provisioning | Wrong datastore configuration | Check CSI driver logs: kubectl logs -n vmware-system-csi -l app=vsphere-csi-controller |
| FluxCD stuck reconciling | Git auth failure | Verify deploy key: flux get sources git |
Next Steps
- Deploy Your First Application — Ship a workload via GitOps
- Day 2 Operations — Upgrades, drift detection, backups
- Secrets Management — Key rotation and lifecycle