
Reference Architecture: Infrastructure Services

Purpose: Provides platform engineers with specifications for the shared infrastructure services that must be in place before Kubernetes deployment.

Overview

Kubernetes clusters depend on several infrastructure services that must be operational before cluster provisioning begins. DNS resolution failures, NTP drift, or DHCP exhaustion cause cluster instability that is difficult to diagnose after the fact. This document defines the requirements for each service.

Service Dependency Order

Provision these services in order; each service depends only on services earlier in the list.

  1. NTP — Time synchronization (no dependencies)
  2. DNS — Name resolution (depends on NTP for DNSSEC)
  3. DHCP / IPAM — IP address assignment (depends on DNS for registration)
  4. LDAP / Active Directory — Identity (depends on DNS, NTP)
  5. Certificate Authority — TLS certificates (depends on NTP, DNS)
  6. HTTP Proxy / Mirror — Package repositories (depends on DNS)

NTP (Network Time Protocol)

Time skew greater than 500 ms between nodes causes etcd leader election failures, TLS certificate validation errors, and log correlation problems.

Requirements

| Specification | Value |
| --- | --- |
| Protocol | NTP (UDP 123) |
| Servers | 2 internal NTP servers minimum (for redundancy) |
| Stratum | Stratum 2 or better (synced to public Stratum 1 or GPS) |
| Max Skew | < 100 ms between any two cluster nodes |
| Client Config | chrony (preferred on Ubuntu/RHEL) or systemd-timesyncd |
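A chrony client configuration meeting these requirements can be sketched as follows; the server hostnames are placeholders, not values defined in this document:

```
# /etc/chrony/chrony.conf (sketch; ntp1/ntp2.example.com are placeholders)
server ntp1.example.com iburst
server ntp2.example.com iburst
makestep 1.0 3   # step the clock only during the first three updates, then slew
rtcsync          # keep the hardware clock in sync with system time
driftfile /var/lib/chrony/drift
```

The `makestep` limit matters on VMs: after the initial sync, chrony only slews the clock, avoiding the abrupt jumps that disrupt etcd.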

Configuration Targets

| Component | NTP Source |
| --- | --- |
| ESXi / KVM hosts | Internal NTP servers |
| vCenter Server | Internal NTP servers |
| Kubernetes VMs | Internal NTP servers (not host-level sync) |
| Network switches | Internal NTP servers |
| Storage arrays | Internal NTP servers |

Configure Kubernetes VMs to use NTP directly rather than VMware Tools time sync, which can step the guest clock abruptly and destabilize etcd.
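A minimal sketch for disabling VMware Tools time sync from inside a guest, assuming open-vm-tools is installed; the commands are skipped gracefully on non-VMware machines:

```shell
# Disable VMware Tools periodic time sync so chrony is the only time source.
# Skips gracefully when vmware-toolbox-cmd is not present (non-VMware guest).
if command -v vmware-toolbox-cmd >/dev/null 2>&1; then
  vmware-toolbox-cmd timesync disable
  vmware-toolbox-cmd timesync status   # operators should see "Disabled"
else
  echo "vmware-toolbox-cmd not found; skipping (not a VMware guest)"
fi
```

Note that this only covers periodic sync; one-off sync on snapshot restore or vMotion is controlled separately in the VM's advanced settings.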

Verification

```shell
# On each node
chronyc tracking
# Verify "System time" offset is < 100 ms
# Verify "Leap status" is "Normal"
```

DNS (Domain Name System)

Kubernetes requires forward and reverse DNS resolution for node hostnames. The API server, etcd, and kubelet all use DNS for peer discovery and certificate validation.

Requirements

| Specification | Value |
| --- | --- |
| Servers | 2 DNS servers minimum (primary + secondary) |
| Software | BIND 9, Unbound, Windows DNS, or Infoblox |
| Zones | Forward zone for cluster domain, reverse zone for node subnets |
| Record Types | A, PTR, CNAME, SRV |
| TTL | 300 seconds (5 min) for cluster records |
| DNSSEC | Optional (requires accurate NTP) |

Required DNS Records

Create these records before running opencenter cluster setup:

| Record | Type | Example | Purpose |
| --- | --- | --- | --- |
| Control plane nodes | A | cp01.k8s.example.com → 10.0.40.10 | Node identity |
| Worker nodes | A | wk01.k8s.example.com → 10.0.40.20 | Node identity |
| API server VIP | A | api.k8s.example.com → 10.0.40.100 | kubectl, CI/CD access |
| Wildcard ingress | A / CNAME | *.apps.k8s.example.com → 10.0.40.101 | Application ingress |
| Reverse PTR | PTR | 10.0.40.10 → cp01.k8s.example.com | Reverse lookup (audit logs) |
| vCenter | A | vcenter.example.com → 10.0.10.5 | vSphere management |
| Bastion | A | bastion.k8s.example.com → 10.0.40.200 | SSH jump host |
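A quick pre-flight check of these records can be scripted with dig; the hostnames and IPs are the examples from the table above, and the script skips gracefully where dig is not installed:

```shell
# Spot-check forward and reverse DNS before cluster setup.
command -v dig >/dev/null 2>&1 || { echo "dig not installed; skipping"; exit 0; }
for host in cp01.k8s.example.com wk01.k8s.example.com api.k8s.example.com; do
  echo "$host -> $(dig +short +time=2 +tries=1 A "$host" | head -1)"
done
# Reverse lookup; operators should see the control plane hostname.
dig +short +time=2 +tries=1 -x 10.0.40.10 || true
```

Empty output for any hostname means the A record is missing and cluster setup will fail certificate validation later.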

Internal vs. External DNS

| Zone | Scope | Resolver |
| --- | --- | --- |
| k8s.example.com | Internal (data center) | Internal DNS servers |
| cluster.local | Kubernetes internal | CoreDNS (deployed by Kubespray) |
| External domains | Internet resolution | Internal DNS with forwarders to upstream |

CoreDNS inside the cluster handles cluster.local service discovery. It forwards all other queries to the infrastructure DNS servers configured in each node's /etc/resolv.conf.
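This forwarding behavior corresponds to a Corefile along these lines (a trimmed sketch of the CoreDNS default; exact contents vary by Kubespray version):

```
.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    forward . /etc/resolv.conf   # everything else goes to infrastructure DNS
    cache 30
    loop
    reload
}
```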

DHCP and IPAM

Static vs. DHCP

| Component | IP Assignment | Reason |
| --- | --- | --- |
| Control plane nodes | Static | Stable IPs for etcd, API server certificates |
| Worker nodes | Static or DHCP reservation | Predictable addressing for DNS records |
| Bastion | Static | Known SSH target |
| BMC/IPMI | Static or DHCP reservation | Out-of-band access |
| Pod network | Calico-managed (IPAM) | Cluster-internal, not infrastructure DHCP |

Static IPs are preferred for all Kubernetes nodes. If using DHCP, use MAC-based reservations to ensure consistent IP assignment.
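For Ubuntu nodes, a static assignment might look like the netplan sketch below; the interface name, gateway, and resolver addresses are placeholders, not values defined in this document:

```yaml
# /etc/netplan/01-k8s-node.yaml (sketch; ens192, 10.0.40.1, and the
# resolver IPs are assumptions for illustration)
network:
  version: 2
  ethernets:
    ens192:
      addresses: [10.0.40.10/24]
      routes:
        - to: default
          via: 10.0.40.1
      nameservers:
        search: [k8s.example.com]
        addresses: [10.0.10.53, 10.0.10.54]
```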

IPAM Planning

| Subnet | CIDR | Usable IPs | Assignment |
| --- | --- | --- | --- |
| Management | 10.0.10.0/24 | 254 | ESXi hosts, vCenter, switches |
| VM Network | 10.0.40.0/24 | 254 | Kubernetes nodes, bastion |
| Pod CIDR | 10.244.0.0/16 | 65,534 | Calico IPAM (per-node /24 blocks) |
| Service CIDR | 10.96.0.0/12 | 1,048,574 | Kubernetes service ClusterIPs |
| MetalLB Pool | 10.0.40.100–10.0.40.120 | 21 | LoadBalancer service IPs |

Reserve the MetalLB IP range in IPAM to prevent conflicts. These IPs must not be assigned to any other device.
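The reserved range translates directly into a MetalLB address pool; a minimal sketch, with an illustrative resource name:

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: vm-network-pool         # illustrative name
  namespace: metallb-system
spec:
  addresses:
    - 10.0.40.100-10.0.40.120   # matches the range reserved in IPAM
```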

LDAP / Active Directory

Keycloak (deployed by openCenter) integrates with LDAP or Active Directory for user authentication. The infrastructure team must provide:

| Requirement | Value |
| --- | --- |
| Protocol | LDAPS (TCP 636) — LDAP over TLS |
| Server | 2 domain controllers minimum |
| Base DN | dc=example,dc=com |
| Bind Account | Service account with read-only access to user/group OUs |
| User Search Base | ou=Users,dc=example,dc=com |
| Group Search Base | ou=Groups,dc=example,dc=com |
| Group for Cluster Admins | cn=k8s-admins,ou=Groups,dc=example,dc=com |
| Group for Viewers | cn=k8s-viewers,ou=Groups,dc=example,dc=com |
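Connectivity and the bind account can be verified with ldapsearch; the search base and group DN come from the table above, while the server hostname, service-account DN, and BIND_PASSWORD variable are assumptions for illustration:

```shell
# Verify LDAPS connectivity and group lookup with the bind account.
# dc01.example.com and the bind DN are placeholders; export BIND_PASSWORD first.
command -v ldapsearch >/dev/null 2>&1 || { echo "ldapsearch not installed; skipping"; exit 0; }
ldapsearch -H ldaps://dc01.example.com:636 \
  -D "cn=svc-keycloak,ou=Users,dc=example,dc=com" -w "$BIND_PASSWORD" \
  -b "ou=Groups,dc=example,dc=com" "(cn=k8s-admins)" member || true
```

A successful run returns the group entry with its member attributes; a TLS error here usually means the domain controller's certificate chain is not trusted by the client.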

Keycloak maps LDAP groups to Kubernetes RBAC roles via RBAC Manager. See the Keycloak and RBAC Manager platform service documentation for configuration details.

Certificate Authority (CA)

TLS certificates are required for the Kubernetes API server, etcd, ingress, and platform services. cert-manager (deployed by openCenter) automates certificate issuance.

Options

| CA Type | Use Case | Integration |
| --- | --- | --- |
| Internal CA (enterprise PKI) | Regulated environments | cert-manager ACME or CA issuer |
| Self-signed CA | Lab / development | cert-manager self-signed issuer |
| Let's Encrypt | Internet-facing clusters | cert-manager ACME issuer |

Requirements for Internal CA

| Requirement | Value |
| --- | --- |
| Protocol | ACME (preferred) or manual CSR signing |
| Root CA | Trusted by all cluster nodes (installed in OS trust store) |
| Intermediate CA | Dedicated intermediate for Kubernetes certificates |
| Key Algorithm | RSA 2048+ or ECDSA P-256 |
| Certificate Lifetime | 90 days (auto-renewed by cert-manager) |
| CRL / OCSP | Accessible from cluster nodes |
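Assuming the intermediate CA's certificate and key are stored in a Kubernetes Secret, a cert-manager CA issuer can be sketched as follows (resource and secret names are illustrative):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca                   # illustrative name
spec:
  ca:
    secretName: internal-ca-keypair   # Secret holding the intermediate cert and key
```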

Distribute the root CA certificate to all Kubernetes nodes during provisioning (via Kubespray extra_certs variable or cloud-init).
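A provisioning-time sketch for installing the root CA into the OS trust store, covering the Debian/Ubuntu and RHEL layouts; the root-ca.crt filename is a placeholder:

```shell
# Install the root CA into the OS trust store (run as root).
# root-ca.crt is a placeholder path; skip when the file is not staged.
[ -f root-ca.crt ] || { echo "root-ca.crt not present; skipping"; exit 0; }
if [ -d /usr/local/share/ca-certificates ]; then       # Debian/Ubuntu
  cp root-ca.crt /usr/local/share/ca-certificates/internal-root-ca.crt
  update-ca-certificates
elif [ -d /etc/pki/ca-trust/source/anchors ]; then     # RHEL/Rocky
  cp root-ca.crt /etc/pki/ca-trust/source/anchors/internal-root-ca.crt
  update-ca-trust extract
else
  echo "unrecognized trust store layout"
fi
```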

HTTP Proxy / Package Mirror

For clusters with restricted internet access (not fully air-gapped):

| Service | Purpose | Software |
| --- | --- | --- |
| HTTP Proxy | Route outbound traffic through a controlled egress point | Squid, Zscaler |
| APT/YUM Mirror | Cache OS packages locally | Aptly, Pulp, Nexus |
| Container Registry Mirror | Cache container images | Harbor (deployed by openCenter), Nexus |

Proxy Configuration

If using an HTTP proxy, configure these environment variables on all Kubernetes nodes:

| Variable | Value |
| --- | --- |
| http_proxy | http://proxy.example.com:3128 |
| https_proxy | http://proxy.example.com:3128 |
| no_proxy | 10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.example.com,.cluster.local,.svc |

The no_proxy list must include all internal CIDRs, the cluster domain, and the Kubernetes service domain. Missing entries cause internal traffic to route through the proxy and fail.
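A sanity check of the no_proxy value can be scripted; the list is the one from the table above, and the required entries mirror this rule:

```shell
# Fail loudly if a required internal domain or CIDR is missing from no_proxy.
no_proxy="10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,.example.com,.cluster.local,.svc"
missing=0
for required in 10.0.0.0/8 .example.com .cluster.local .svc; do
  case ",${no_proxy}," in
    *",${required},"*) echo "ok: ${required}" ;;
    *) echo "MISSING: ${required}"; missing=1 ;;
  esac
done
[ "$missing" -eq 0 ] && echo "no_proxy looks complete"
```

This only checks exact entries; note that some tools match no_proxy by suffix rather than CIDR, so per-host entries may still be needed for older clients.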

For fully air-gapped deployments, use openCenter-AirGap instead of a proxy. See the air-gap documentation.

Considerations

  • Service availability: DNS and NTP are single points of failure for the entire cluster. Deploy at least two instances of each, on separate hosts, in separate failure domains.
  • Monitoring: Monitor DNS query latency, NTP offset, and DHCP lease utilization. Feed metrics into Prometheus via exporters (bind_exporter, chrony_exporter, dhcp_exporter).
  • Change management: DNS record changes and IPAM updates affect cluster operations. Use a change management process (ticket + approval) for production DNS zones.
  • Firewall rules: Ensure all Kubernetes nodes can reach DNS (TCP/UDP 53), NTP (UDP 123), LDAPS (TCP 636), and the HTTP proxy (TCP 3128) from the VM network VLAN.
  • Documentation: Maintain a network services runbook listing server IPs, credentials (in a vault), and escalation contacts for each infrastructure service.