Reference Architecture: Hypervisor
Purpose: Provides platform engineers with hypervisor selection guidance for the GA production platforms and explains where baremetal fits.
Overview
openCenter's GA production reference architecture centers on VMware vSphere and OpenStack (KVM). Baremetal remains a supported pre-provisioned host path, and Kind stays local-only. AWS is not part of the GA reference architecture.
Supported Hypervisors
| Hypervisor / Host Model | Maturity | Use Case | CSI Driver | Cloud Provider |
|---|---|---|---|---|
| VMware vSphere 7.0+ | Production | Enterprise data center | vSphere CSI | vSphere Cloud Provider |
| OpenStack (KVM) Zed+ | Production | Private cloud, multi-tenant | OpenStack Cinder CSI | OpenStack Cloud Provider |
| Linux baremetal hosts | Production | Existing physical estates | External or platform-selected storage | None |
| Kind | Development | Local testing, CI/CD | Local path | None |
VMware vSphere Configuration
vCenter Server
| Setting | Value |
|---|---|
| vCenter Version | 7.0 U3 or 8.0+ |
| Deployment Size | Small (up to 100 hosts) or Medium (up to 400 hosts) |
| Database | Embedded PostgreSQL (default) |
| SSO Domain | vsphere.local (or customer domain) |
| NTP | Sync to same NTP source as ESXi hosts |
ESXi Host Configuration
| Setting | Value | Reason |
|---|---|---|
| ESXi Version | 7.0 U3 or 8.0+ | Match vCenter version |
| Scratch Partition | Persistent (local datastore) | Required for log persistence |
| NTP | Configured, synced | Time skew breaks certificates and etcd |
| SSH | Disabled (enable only for troubleshooting) | Security hardening |
| Lockdown Mode | Normal | Prevents direct host access; manage via vCenter |
| Syslog | Forward to central syslog / Loki | Audit and troubleshooting |
| Power Management | High Performance | Prevents CPU frequency scaling |
| NUMA | Expose NUMA topology to VMs | Enables NUMA-aware scheduling |
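The host settings above can be scripted from the ESXi shell. A hedged sketch, assuming ESXi 7.0 U1 or later (where `esxcli system ntp set` is available); the NTP and syslog targets are placeholders:

```shell
# Configure NTP servers and enable the service
esxcli system ntp set --server=ntp1.example.com --server=ntp2.example.com --enabled=true

# Forward host logs to a central syslog collector (placeholder address)
esxcli system syslog config set --loghost='tcp://syslog.example.com:514'
esxcli system syslog reload

# Select the High Performance power policy
esxcli system settings advanced set --option=/Power/CpuPolicy --string-value='High Performance'

# Keep SSH disabled outside troubleshooting windows
vim-cmd hostsvc/disable_ssh
```

Apply these per host (or via Host Profiles) so configuration stays consistent across the cluster.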
vSphere Cluster Settings
| Setting | Value | Notes |
|---|---|---|
| HA (High Availability) | Enabled | Restarts VMs on host failure |
| HA Admission Control | Reserve 1 host capacity | Ensures failover capacity |
| DRS (Distributed Resource Scheduler) | Enabled, Fully Automated | Balances VM placement across hosts |
| DRS Migration Threshold | Level 3 (moderate) | Avoids excessive vMotion |
| EVC (Enhanced vMotion Compatibility) | Set to lowest CPU generation in cluster | Enables vMotion between different CPU models |
| vSAN | Optional (see Virtual Storage) | If using local storage |
| Proactive HA | Enabled (if hardware supports) | Migrates VMs before predicted failure |
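The HA and DRS toggles above can be scripted with govc (the CLI from the govmomi project); a sketch, assuming a cluster at the placeholder inventory path `/dc/host/k8s`:

```shell
# Enable HA and fully automated DRS on the cluster
govc cluster.change -ha-enabled -drs-enabled -drs-mode fullyAutomated /dc/host/k8s
```

Admission control, EVC mode, and the DRS migration threshold are typically set once in the vSphere Client and verified during cluster acceptance testing.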
Resource Pools
Create resource pools to isolate Kubernetes roles and prevent resource contention.
| Resource Pool | CPU Shares | Memory Shares | CPU Reservation | Memory Reservation |
|---|---|---|---|---|
| k8s-control-plane | High | High | 24 GHz (3× 8 vCPU) | 48 GB (3× 16 GB) |
| k8s-workers | Normal | Normal | None | None |
| infrastructure | Normal | Normal | 4 GHz | 8 GB |
Reservations on the control plane pool guarantee that etcd and the API server always have CPU and memory, even when worker nodes are under heavy load.
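A sketch of creating these pools with govc, assuming the placeholder cluster path `/dc/host/k8s`; govc takes reservations in MHz and MB, so the table values convert as shown:

```shell
# Control plane pool: 24 GHz CPU / 48 GB memory reserved, high shares
govc pool.create -cpu.reservation=24000 -cpu.shares=high \
  -mem.reservation=49152 -mem.shares=high \
  /dc/host/k8s/Resources/k8s-control-plane

# Worker pool: no reservations, default (normal) shares
govc pool.create /dc/host/k8s/Resources/k8s-workers

# Infrastructure pool: modest reservations for shared services
govc pool.create -cpu.reservation=4000 -mem.reservation=8192 \
  /dc/host/k8s/Resources/infrastructure
```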
VM Hardware Settings
| Setting | Control Plane VM | Worker VM |
|---|---|---|
| Hardware Version | vmx-19 (vSphere 7.0 U2+) | vmx-19 |
| Guest OS | Ubuntu 22.04 LTS 64-bit | Ubuntu 22.04 LTS 64-bit |
| CPU Hot Add | Disabled | Disabled |
| Memory Hot Add | Disabled | Disabled |
| Disk Controller | PVSCSI | PVSCSI |
| Network Adapter | VMXNET3 | VMXNET3 |
| Secure Boot | Enabled | Enabled |
| vTPM | Enabled | Enabled |
Disable CPU and Memory Hot Add. Enabling CPU Hot Add disables vNUMA exposure to the guest, and both features add overhead. Size VMs correctly at creation instead.
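A sketch of enforcing this with govc, assuming a VM named `k8s-cp-1` (flag names follow current govc releases):

```shell
# Disable CPU and memory hot add so vNUMA stays exposed to the guest
govc vm.change -vm k8s-cp-1 -cpu-hot-add-enabled=false -memory-hot-add-enabled=false
```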
OpenStack (KVM) Configuration
OpenStack Services Required
| Service | Component | Purpose |
|---|---|---|
| Nova | Compute | VM lifecycle management |
| Neutron | Networking | Virtual networks, security groups |
| Cinder | Block Storage | Persistent volumes for VMs |
| Glance | Image | VM image repository |
| Keystone | Identity | Authentication and authorization |
| Octavia (optional) | Load Balancer | Kubernetes LoadBalancer services |
Nova Compute Configuration
| Setting | Value | Reason |
|---|---|---|
| Hypervisor | KVM (libvirt) | Native performance |
| CPU Mode | host-passthrough | Exposes host CPU features to VMs |
| CPU Overcommit Ratio | 1.5:1 (max) | Conservative for Kubernetes |
| Memory Overcommit Ratio | 1.0:1 (none) | Kubernetes assumes dedicated memory |
| Huge Pages | 2 MB (enabled) | Reduces TLB misses for large VMs |
| NUMA Topology | Expose to instances | Enables NUMA-aware placement |
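On each compute node, the table above maps roughly to the following nova.conf fragment (option and section names per the Nova configuration reference; huge pages must also be reserved in the host kernel command line):

```ini
[DEFAULT]
# Conservative overcommit for Kubernetes nodes
cpu_allocation_ratio = 1.5
ram_allocation_ratio = 1.0

[libvirt]
virt_type = kvm
# Expose the host CPU model and features to guests
cpu_mode = host-passthrough
```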
Flavor Definitions
Create dedicated flavors for Kubernetes nodes:
| Flavor Name | vCPU | Memory | Root Disk | Properties |
|---|---|---|---|---|
| oc.cp.small | 4 | 8 GB | 100 GB | hw:cpu_policy=dedicated, hw:mem_page_size=2048 |
| oc.cp.medium | 8 | 16 GB | 200 GB | hw:cpu_policy=dedicated, hw:mem_page_size=2048 |
| oc.cp.large | 16 | 32 GB | 500 GB | hw:cpu_policy=dedicated, hw:mem_page_size=2048 |
| oc.worker.general | 4 | 16 GB | 100 GB | hw:mem_page_size=2048 |
| oc.worker.compute | 8 | 16 GB | 100 GB | hw:cpu_policy=dedicated |
| oc.worker.memory | 4 | 32 GB | 100 GB | hw:mem_page_size=2048 |
Use hw:cpu_policy=dedicated for control plane nodes to pin vCPUs to physical cores. This eliminates CPU scheduling jitter that affects etcd latency.
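The flavors can be created with the OpenStack CLI; for example, `oc.cp.medium` from the table (`--ram` is in MB):

```shell
openstack flavor create oc.cp.medium \
  --vcpus 8 --ram 16384 --disk 200 \
  --property hw:cpu_policy=dedicated \
  --property hw:mem_page_size=2048
```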
Security Groups
| Rule | Protocol | Port | Source | Purpose |
|---|---|---|---|---|
| SSH | TCP | 22 | Bastion SG | Node access |
| Kubernetes API | TCP | 6443 | Worker SG, Bastion SG | API server |
| etcd | TCP | 2379–2380 | Control Plane SG | etcd cluster |
| Kubelet | TCP | 10250 | Control Plane SG | Kubelet API |
| NodePort | TCP | 30000–32767 | Load Balancer SG | Service exposure |
| Calico BGP | TCP | 179 | Node SG | CNI networking |
| VXLAN | UDP | 4789 | Node SG | Overlay network |
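A sketch of two of the rules above as CLI commands, assuming security groups named `k8s-control-plane` and `k8s-worker` already exist:

```shell
# Allow workers to reach the Kubernetes API server
openstack security group rule create --protocol tcp --dst-port 6443 \
  --remote-group k8s-worker k8s-control-plane

# Allow etcd client and peer traffic between control plane nodes
openstack security group rule create --protocol tcp --dst-port 2379:2380 \
  --remote-group k8s-control-plane k8s-control-plane
```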
Anti-Affinity Rules
Distribute Kubernetes control plane VMs across physical hosts to survive host failures.
vSphere
Create a DRS anti-affinity rule:
- Rule type: "Separate Virtual Machines"
- Members: all 3 control plane VMs
- This ensures DRS never places two control plane VMs on the same ESXi host.
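The rule above can also be created with govc; a sketch assuming the placeholder cluster path and VM names:

```shell
# Keep the three control plane VMs on separate ESXi hosts
govc cluster.rule.create -name k8s-cp-anti-affinity -enable -anti-affinity \
  -cluster /dc/host/k8s k8s-cp-1 k8s-cp-2 k8s-cp-3
```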
OpenStack
Use server groups with anti-affinity policy:
```shell
openstack server group create --policy anti-affinity k8s-control-plane
```
Reference the server group when creating control plane instances.
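For example, assuming the `k8s-control-plane` server group created above plus placeholder image and network names:

```shell
# Look up the server group ID, then pass it as a scheduler hint
GROUP_ID=$(openstack server group show k8s-control-plane -f value -c id)
openstack server create --flavor oc.cp.medium --image ubuntu-22.04 \
  --network k8s-net --hint group="$GROUP_ID" k8s-cp-1
```

Nova then refuses to place the instance on a host already running a member of the group.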
Considerations
- Licensing: vSphere requires per-CPU licensing (vSphere Standard or Enterprise Plus). Enterprise Plus is required for DRS, vSAN, and distributed switches. OpenStack has no licensing cost but requires operational expertise.
- Patching: Schedule ESXi/KVM host patching during maintenance windows. Use vSphere Update Manager (VUM) or OpenStack rolling upgrades to patch one host at a time while VMs migrate to remaining hosts.
- Backup: Back up vCenter Server Appliance (VCSA) configuration regularly. For OpenStack, back up the control plane databases (MariaDB/Galera, RabbitMQ).
- Monitoring: Deploy the vSphere Prometheus exporter or OpenStack exporter to feed hypervisor metrics into the Grafana stack.
- Nested virtualization: Not supported for production. Kind clusters for development run inside VMs but do not use nested KVM/VT-x.