Reference Architecture: Physical Storage

Purpose: Provides platform engineers with physical storage specifications, including disk types, RAID configurations, storage tiers, and performance baselines.

Overview

openCenter workloads have distinct storage performance profiles. etcd requires low-latency writes (< 10 ms fsync). Container images and application data need throughput. Logs and metrics tolerate higher latency but consume large volumes. This document defines the physical disk layout to meet these requirements.

Storage Tiers

| Tier | Media | Use Case | IOPS Target | Latency Target |
|---|---|---|---|---|
| Tier 0 (Ultra) | NVMe SSD | etcd, database WAL | > 10,000 IOPS | < 1 ms |
| Tier 1 (Performance) | SATA/SAS SSD | Container images, OS, application data | > 5,000 IOPS | < 5 ms |
| Tier 2 (Capacity) | SAS HDD (10K/15K RPM) | Logs, metrics long-term, backups | > 200 IOPS | < 10 ms |

Disk Configurations by Server Role

Hypervisor Hosts (Local Storage)

| Slot | Disk Type | Size | RAID | Purpose |
|---|---|---|---|---|
| 0–1 | SATA SSD or NVMe M.2 | 480 GB each | RAID 1 | ESXi / KVM boot volume |
| 2–3 | NVMe U.2 SSD | 1.92 TB each | RAID 1 | Tier 0: etcd, control plane VM disks |
| 4–7 | NVMe U.2 SSD | 3.84 TB each | RAID 10 | Tier 1: worker VM disks, container images |

Total usable local storage per host: ~1.9 TB (Tier 0) + ~7.7 TB (Tier 1).
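The usable totals follow directly from the RAID levels in the table: both RAID 1 (mirror) and RAID 10 (striped mirrors) halve raw capacity. A quick arithmetic check of the slot layout above:

```shell
# Tier 0: slots 2-3, 2 x 1.92 TB in RAID 1  -> raw / 2
# Tier 1: slots 4-7, 4 x 3.84 TB in RAID 10 -> raw / 2
awk 'BEGIN {
  tier0 = 2 * 1.92 / 2    # 1.92 TB usable (~1.9 TB after formatting overhead)
  tier1 = 4 * 3.84 / 2    # 7.68 TB usable (~7.7 TB)
  printf "Tier 0: %.2f TB, Tier 1: %.2f TB\n", tier0, tier1
}'
```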

For vSAN deployments, replace the RAID configuration with vSAN disk groups (1 cache + 2–4 capacity disks per group). See Virtual Storage.

Hypervisor Hosts (Shared Storage)

If using external SAN/NAS instead of local storage:

| Component | Specification |
|---|---|
| Protocol | iSCSI (10/25 GbE), NFS v4.1, or Fibre Channel (16/32 Gbps) |
| LUN/Volume per host | 1–2 datastores, thin provisioned |
| Multipath | MPIO with round-robin policy (iSCSI/FC) |
| Dedicated Network | VLAN 30 (Storage), MTU 9000 |
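On Linux hosts, the round-robin MPIO policy above maps to dm-multipath. A minimal `/etc/multipath.conf` sketch; treat it as a starting point, since the device-specific section (vendor/product matching, ALUA settings) depends on the array:

```
defaults {
    user_friendly_names  yes
    path_grouping_policy multibus
    path_selector        "round-robin 0"
    failback             immediate
}
```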

Management Hosts

| Slot | Disk Type | Size | RAID | Purpose |
|---|---|---|---|---|
| 0–1 | SATA SSD | 480 GB each | RAID 1 | OS boot |
| 2–3 | SATA SSD | 960 GB each | RAID 1 | vCenter DB, bastion data |

RAID Configuration Reference

| RAID Level | Min Disks | Usable Capacity | Read IOPS | Write IOPS | Use Case |
|---|---|---|---|---|---|
| RAID 1 | 2 | 50% | 2× single | 1× single | Boot, etcd (write safety) |
| RAID 5 | 3 | (N-1)/N | Good | Degraded (parity) | Read-heavy capacity |
| RAID 6 | 4 | (N-2)/N | Good | Poor (double parity) | Large capacity, dual fault |
| RAID 10 | 4 | 50% | N× single | N/2× single | Performance + redundancy |

RAID 10 is the default recommendation for Kubernetes workloads. The RAID 5/6 write penalty is unacceptable for etcd and database workloads.
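The write penalty can be quantified: each random write costs 2 disk I/Os on RAID 1/10 (write both mirrors), 4 on RAID 5 (read data, read parity, write data, write parity), and 6 on RAID 6 (double parity). A sketch with an illustrative per-disk write IOPS figure (not a measured value):

```shell
# Effective random-write IOPS = (disks * per-disk IOPS) / write penalty
awk 'BEGIN {
  disks = 4; iops = 10000    # illustrative per-disk write IOPS
  printf "RAID 10: %d, RAID 5: %d, RAID 6: %d\n",
         disks*iops/2, disks*iops/4, disks*iops/6
}'
```

With the same four disks, RAID 5 delivers half and RAID 6 a third of the RAID 10 write throughput, which is why parity RAID is ruled out for etcd.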

RAID Controller Settings

| Setting | Value | Reason |
|---|---|---|
| Write Policy | Write-Back with BBU/FBU | Write-Through halves write IOPS |
| Read Policy | Read Ahead (Adaptive) | Benefits sequential reads |
| Stripe Size | 256 KB | Matches typical Kubernetes I/O patterns |
| Cache | Enabled (with battery backup) | Required for Write-Back safety |
| Disk Cache | Disabled | Controller cache is sufficient; disk cache risks data loss |
| Patrol Read | Enabled (weekly) | Detects latent media errors |

If using NVMe drives directly (no RAID controller), configure software RAID via mdadm (Linux) or rely on vSAN/Ceph for redundancy.
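A minimal mdadm sketch for the four Tier 1 NVMe drives, assuming the RAID 10 layout from the slot table; device names are illustrative, so verify them with `lsblk` before running:

```shell
# Create a 4-disk software RAID 10 array (device names are examples)
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
      /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1

# Persist the array definition (path varies by distro:
# /etc/mdadm.conf on RHEL-family, /etc/mdadm/mdadm.conf on Debian-family)
mdadm --detail --scan >> /etc/mdadm.conf

# Watch the initial sync progress
cat /proc/mdstat
```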

Performance Baselines

Test storage performance before deploying Kubernetes. etcd is the most latency-sensitive component.

etcd Storage Validation

Run fio on the target disk to verify it meets etcd requirements:

```shell
fio --name=etcd-bench --ioengine=libaio --direct=1 --bs=4k \
    --iodepth=1 --rw=write --size=1G --runtime=60 \
    --filename=/var/lib/etcd/fio-test --fsync=1
```

| Metric | Minimum | Recommended |
|---|---|---|
| fsync p99 latency | < 10 ms | < 2 ms |
| Sequential write IOPS (4K) | > 500 | > 5,000 |
| Sequential write throughput | > 50 MB/s | > 200 MB/s |

If fsync p99 exceeds 10 ms, etcd will log warnings and cluster stability degrades. Use NVMe, not SATA SSD, for etcd volumes.

SAN/NAS Specifications (External Storage)

If deploying shared storage instead of or alongside local disks:

| Component | Minimum | Recommended |
|---|---|---|
| Array Type | Mid-range (Dell PowerStore, NetApp AFF A250) | Enterprise (Dell PowerStore 9200, NetApp AFF A800) |
| Protocol | iSCSI 10 GbE | iSCSI 25 GbE or FC 32 Gbps |
| Cache | 64 GB | 256 GB+ |
| Drives | All-flash SSD | All-flash NVMe |
| Redundancy | Dual controllers, dual fabric | Dual controllers, dual fabric, active-active |
| Snapshot/Clone | Required for Velero CSI snapshots | Required |

Disk Replacement and Monitoring

  • Configure RAID controller alerts to forward to the monitoring stack (SNMP traps → Prometheus Alertmanager).
  • Set predictive failure thresholds: replace drives when SMART reports reallocated sector count > 0 or wear leveling < 10%.
  • Hot-spare disks: allocate one global hot spare per RAID controller for automatic rebuild.
  • Rebuild time for a 3.84 TB SSD in RAID 10: approximately 2–4 hours depending on controller and load.
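The rebuild-time range above can be sanity-checked as capacity divided by sustained rebuild rate. The 300 MB/s rate here is an assumption; controllers typically throttle rebuilds under production load, pushing toward the upper end of the range:

```shell
# Rough rebuild-time estimate for a 3.84 TB drive
awk 'BEGIN {
  size_gb = 3840; rate_mbs = 300        # assumed sustained rebuild rate
  hours = size_gb * 1024 / rate_mbs / 3600
  printf "Estimated rebuild: %.1f hours\n", hours
}'
```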

Considerations

  • NVMe vs. SATA SSD: NVMe provides 3–5× the IOPS and lower latency than SATA SSD. Use NVMe for any disk hosting etcd or database workloads.
  • Drive endurance: Select drives rated for at least 1 DWPD (Drive Writes Per Day) for mixed workloads, 3 DWPD for write-intensive (etcd, databases).
  • Encryption: Use self-encrypting drives (SED) with OPAL 2.0 if data-at-rest encryption is required at the hardware level. This is in addition to Kubernetes encryption at rest.
  • Capacity planning: Prometheus with 15-second scrape interval and 90-day retention consumes approximately 50–100 GB. Loki log retention at 30 days consumes 100–500 GB depending on log volume. Plan Tier 1/Tier 2 capacity accordingly.
  • vSAN licensing: If using VMware vSAN, local disks must meet the vSAN HCL. Check compatibility before purchasing drives.
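The Prometheus capacity figure can be sanity-checked with the usual sizing formula: samples/sec × retention × bytes/sample. The 50,000 active series and ~2 bytes per compressed sample below are assumptions; actual compression varies with series churn and label cardinality:

```shell
# Prometheus disk estimate: (series / scrape interval) * retention * bytes/sample
awk 'BEGIN {
  series = 50000; interval = 15; days = 90; bytes = 2
  gib = (series / interval) * days * 86400 * bytes / (1024^3)
  printf "~%.0f GiB for %d-day retention\n", gib, days
}'
```

At these assumptions the estimate lands near the low end of the 50–100 GB range quoted above; doubling the active series roughly doubles the footprint.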