
First VMware Cluster

Purpose: Walks platform engineers through deploying a first production-ready Kubernetes cluster on VMware vSphere using openCenter (about 15 minutes of configuration plus 40–60 minutes of deployment).

What You'll Do

  1. Initialize a VMware cluster configuration with the CLI
  2. Validate, generate the GitOps repository, and deploy
  3. Verify nodes, vSphere CSI, and platform services are running

End result: A 6-node HA Kubernetes cluster on vSphere (3 control plane nodes, 3 workers) with vSphere CSI storage, Kyverno policies, monitoring, and GitOps, ready for application workloads (roughly 55–75 minutes in total, including configuration).

Prerequisites

  • openCenter CLI installed (CLI Installation)
  • VMware vSphere 7.0+ with admin credentials
  • vCenter access with permissions to create VMs, networks, and datastores
  • A VM template (Ubuntu 22.04 recommended) uploaded to vSphere
  • DNS records or a wildcard domain for ingress
  • SSH key pair for node access (or let the CLI generate one)
  • A Git repository for GitOps (GitHub or GitLab)

Step 1: Initialize the Cluster Configuration

opencenter cluster init my-vmware-cluster --org my-org --type vmware

This creates the configuration at ~/.config/opencenter/clusters/my-org/.my-vmware-cluster-config.yaml and auto-generates SOPS Age keys and an SSH key pair.

Edit the configuration:

opencenter cluster edit my-vmware-cluster

Key sections to configure:

opencenter:
  cluster:
    cluster_name: my-vmware-cluster
    organization: my-org

  infrastructure:
    provider: vmware
    cloud:
      vmware:
        vcenter_host: vcenter.example.com
        username: administrator@vsphere.local
        password: "${VSPHERE_PASSWORD}"
        datacenter: DC1
        cluster: Cluster01
        datastore: datastore1
        network: VM Network
        template: ubuntu-22.04-template
        resource_pool: openCenter

  kubernetes:
    version: 1.33.5
    control_plane_count: 3
    worker_count: 3
    cni: calico

  services:
    keycloak:
      enabled: true
    kube-prometheus-stack:
      enabled: true
    loki:
      enabled: true
    velero:
      enabled: true
    vsphere-csi:
      enabled: true

  secrets:
    sops:
      age_keys:
        - age1... # Auto-generated during init
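
The password field above references ${VSPHERE_PASSWORD} rather than a literal value. The following is a minimal sketch of a guard you could run before validating or deploying, assuming the CLI resolves that placeholder from the environment (the source does not say so explicitly). The helper name require_env is hypothetical and not part of openCenter.

```shell
#!/bin/sh
# Hypothetical helper: fail early if any named variable is unset or empty.
# Assumption: openCenter reads ${VSPHERE_PASSWORD} from the environment,
# so the variable must also be exported in the shell that runs the CLI.
require_env() {
  for name in "$@"; do
    # Reject anything that is not a plain identifier before using eval.
    case $name in
      ''|[0-9]*|*[!A-Za-z0-9_]*)
        echo "invalid variable name: $name" >&2
        return 2
        ;;
    esac
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "environment variable $name is not set" >&2
      return 1
    fi
  done
}

# Example use (commented out so the sketch has no side effects):
#   export VSPHERE_PASSWORD='...'
#   require_env VSPHERE_PASSWORD && opencenter cluster validate my-vmware-cluster
```

Keeping the secret in an environment variable means it never has to be written into the YAML file in plain text.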

Step 2: Validate Configuration

opencenter cluster validate my-vmware-cluster

Validation checks: schema compliance, provider-specific rules (vCenter connectivity, datastore existence, template availability), and business rules (node count ≥ 3 for HA).
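
The "node count ≥ 3 for HA" rule follows from etcd's majority quorum: a cluster of n members keeps working only while more than half of them are healthy. The sketch below illustrates that arithmetic, assuming etcd runs on the control plane nodes. It is not the openCenter validator, and both function names are hypothetical.

```shell
#!/bin/sh
# Illustrative pre-check only; this is not how openCenter validates.

# Number of etcd members that can fail while a majority (quorum) survives.
etcd_fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

# Return 0 when the control plane count allows at least one failure.
check_control_plane_count() {
  count=$1
  if [ "$count" -lt 3 ]; then
    echo "control_plane_count=$count: fewer than 3 members, no failure is tolerated" >&2
    return 1
  fi
  if [ $(( count % 2 )) -eq 0 ]; then
    # An even count tolerates no more failures than the odd count below it.
    echo "control_plane_count=$count: even count adds no extra fault tolerance" >&2
  fi
  echo "control_plane_count=$count: tolerates $(etcd_fault_tolerance "$count") member failure(s)"
}

check_control_plane_count 3
```

With the control_plane_count of 3 used in this guide, one control plane node can be lost without losing quorum.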

Step 3: Generate GitOps Repository

opencenter cluster generate my-vmware-cluster

This generates:

  • Terraform/OpenTofu infrastructure code for VM provisioning
  • Kubespray inventory with security hardening (k8s_hardening.yml)
  • FluxCD application manifests (GitRepository sources referencing openCenter-gitops-base)
  • SOPS-encrypted secrets
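
Before pushing the generated repository, it can be worth confirming that secret files really are encrypted. The sketch below relies on two markers that SOPS writes into encrypted YAML: a top-level sops: metadata block and ENC[AES256_GCM,...] values. The layout of the generated repository is not described here, so the file paths are left for you to supply, and the function names are hypothetical.

```shell
#!/bin/sh
# Heuristic check: SOPS-encrypted YAML has a top-level "sops:" block
# and stores each secret value as an ENC[AES256_GCM,...] string.
is_sops_encrypted() {
  grep -q '^sops:' "$1" && grep -q 'ENC\[AES256_GCM,' "$1"
}

# Print every path argument that does not look encrypted.
# Returns non-zero if at least one such file is found.
report_unencrypted() {
  status=0
  for file in "$@"; do
    if ! is_sops_encrypted "$file"; then
      echo "NOT ENCRYPTED: $file"
      status=1
    fi
  done
  return "$status"
}

# Example use (paths are placeholders, not the real generated layout):
#   report_unencrypted path/to/secret-a.yaml path/to/secret-b.yaml
```

This is a pattern match, not a cryptographic verification, so treat a pass as a sanity check only.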

Step 4: Deploy the Cluster

opencenter cluster deploy my-vmware-cluster

The deploy command:

  1. Provisions VMs in vSphere via Terraform (10–15 minutes)
  2. Installs Kubernetes via Kubespray with containerd, etcd HA, and security hardening (20–30 minutes)
  3. Bootstraps FluxCD, which reconciles platform services from openCenter-gitops-base (10–15 minutes)
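
The three phase estimates above can be added up as a quick check on the overall time budget. The arithmetic below uses only the figures quoted in this guide, including the 15-minute configuration estimate from the Purpose line; the variable names are illustrative.

```shell
#!/bin/sh
# Lower and upper bounds per phase, in minutes, as quoted in the steps above.
deploy_min=$(( 10 + 20 + 10 ))   # Terraform + Kubespray + FluxCD, best case
deploy_max=$(( 15 + 30 + 15 ))   # Terraform + Kubespray + FluxCD, worst case

config_minutes=15                # configuration estimate from the Purpose line
total_min=$(( config_minutes + deploy_min ))
total_max=$(( config_minutes + deploy_max ))

echo "deployment: ${deploy_min}-${deploy_max} minutes"
echo "end to end: ${total_min}-${total_max} minutes"
```

If a phase runs well past its upper bound, that is usually the point at which to start checking the Troubleshooting section.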

Step 5: Verify the Cluster

# Check cluster status
opencenter cluster status my-vmware-cluster

# Verify nodes
kubectl get nodes

# Check vSphere CSI driver
kubectl get pods -n vmware-system-csi

# Confirm FluxCD reconciliation
flux get kustomizations

# Check storage classes
kubectl get sc
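
The node check can be scripted rather than read by eye. The sketch below parses the output of kubectl get nodes --no-headers from standard input and succeeds only if the expected number of nodes is present and all of them report Ready. The function name nodes_ready is hypothetical, and a cordoned node (Ready,SchedulingDisabled) is counted as Ready here, which may or may not suit your needs.

```shell
#!/bin/sh
# Reads `kubectl get nodes --no-headers` output on stdin.
# $1 is the expected node count. Column 2 of each line is the node STATUS.
nodes_ready() {
  awk -v expected="$1" '
    { total++ }
    $2 !~ /^Ready/ { not_ready++ }   # "NotReady" and "Unknown" land here
    END {
      printf "total=%d not_ready=%d expected=%d\n", total, not_ready, expected
      exit (total == expected && not_ready == 0) ? 0 : 1
    }
  '
}

# Example use for the 6-node cluster in this guide (commented out so the
# sketch runs without a cluster):
#   kubectl get nodes --no-headers | nodes_ready 6
```

The non-zero exit status makes the function usable as a gate in a CI job or a retry loop.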

Check Your Work

  • All nodes show Ready status
  • vSphere CSI driver pods are running in the vmware-system-csi namespace
  • FluxCD kustomizations show Ready=True
  • cert-manager is issuing certificates (kubectl get cert -A)
  • kubectl get sc shows vSphere storage class as default
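
The default storage class is marked by the standard Kubernetes annotation storageclass.kubernetes.io/is-default-class. The sketch below shows one way to check it programmatically. It expects tab-separated "name, annotation value" lines on standard input, and the function name and the storage class names in the example are hypothetical.

```shell
#!/bin/sh
# Reads "<name><TAB><is-default-class annotation>" lines on stdin and
# prints the names of storage classes marked as default.
default_storage_classes() {
  awk -F '\t' '$2 == "true" { print $1 }'
}

# One way to produce that input from a live cluster (commented out so the
# sketch runs without one):
#   kubectl get sc -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}{"\n"}{end}' \
#     | default_storage_classes

# Demonstration with made-up storage class names:
printf 'example-vsphere-sc\ttrue\nexample-other-sc\t\n' | default_storage_classes
```

Exactly one name should be printed. No output means no default is set, and more than one means the cluster has conflicting defaults.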

Platform Services Deployed

After FluxCD reconciles, these services from openCenter-gitops-base are running:

| Service | Version | Namespace |
|---|---|---|
| cert-manager | v1.18.2 | cert-manager |
| Gateway API (Envoy) | latest | envoy-gateway-system |
| Keycloak | 26.4.2 | keycloak |
| Kyverno | 3.6.0 | kyverno |
| kube-prometheus-stack | 77.6.0 | observability |
| Loki | 6.45.2 | observability |
| Velero | 10.1.1 | velero |
| vSphere CSI | 3.8.1 | vmware-system-csi |

Troubleshooting

| Symptom | Likely Cause | Fix |
|---|---|---|
| VMs fail to create | Insufficient datastore space or wrong template | Verify datastore and template names match vCenter exactly |
| Nodes not joining | Firewall blocking port 6443 | Verify VM Network allows inter-node traffic |
| CSI volumes not provisioning | Wrong datastore configuration | Check CSI driver logs: kubectl logs -n vmware-system-csi -l app=vsphere-csi-controller |
| FluxCD stuck reconciling | Git auth failure | Verify deploy key: flux get sources git |

Next Steps