Reference Architecture: Physical Storage
Purpose: Provides platform engineers with physical storage specifications, including disk types, RAID configurations, storage tiers, and performance baselines.
Overview
openCenter workloads have distinct storage performance profiles. etcd requires low-latency writes (< 10 ms fsync). Container images and application data need throughput. Logs and metrics tolerate higher latency but consume large volumes. This document defines the physical disk layout to meet these requirements.
Storage Tiers
| Tier | Media | Use Case | IOPS Target | Latency Target |
|---|---|---|---|---|
| Tier 0 (Ultra) | NVMe SSD | etcd, database WAL | > 10,000 IOPS | < 1 ms |
| Tier 1 (Performance) | SATA/SAS SSD | Container images, OS, application data | > 5,000 IOPS | < 5 ms |
| Tier 2 (Capacity) | SAS HDD (10K/15K RPM) | Logs, metrics long-term, backups | > 200 IOPS | < 10 ms |
Disk Configurations by Server Role
Hypervisor Hosts (Local Storage)
| Slot | Disk Type | Size | RAID | Purpose |
|---|---|---|---|---|
| 0–1 | SATA SSD or NVMe M.2 | 480 GB each | RAID 1 | ESXi / KVM boot volume |
| 2–3 | NVMe U.2 SSD | 1.92 TB each | RAID 1 | Tier 0: etcd, control plane VM disks |
| 4–7 | NVMe U.2 SSD | 3.84 TB each | RAID 10 | Tier 1: worker VM disks, container images |
Total usable local storage per host: ~1.9 TB (Tier 0) + ~7.7 TB (Tier 1).
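The usable-capacity figures follow directly from the RAID math: RAID 1 mirrors a pair (usable = one disk), and RAID 10 mirrors striped pairs (usable = half the raw capacity). A quick sanity check of the table above:

```shell
# RAID 1 over 2x 1.92 TB: usable = capacity of one disk.
tier0_gb=$(( 1920 * 2 / 2 ))
# RAID 10 over 4x 3.84 TB: usable = half of total raw capacity.
tier1_gb=$(( 3840 * 4 / 2 ))
echo "Tier 0 usable: ${tier0_gb} GB"   # 1920 GB ~ 1.9 TB
echo "Tier 1 usable: ${tier1_gb} GB"   # 7680 GB ~ 7.7 TB
```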
For vSAN deployments, replace the RAID configuration with vSAN disk groups (1 cache + 2–4 capacity disks per group). See Virtual Storage.
Hypervisor Hosts (Shared Storage)
If using external SAN/NAS instead of local storage:
| Component | Specification |
|---|---|
| Protocol | iSCSI (10/25 GbE), NFS v4.1, or Fibre Channel (16/32 Gbps) |
| LUN/Volume per host | 1–2 datastores, thin provisioned |
| Multipath | MPIO with round-robin policy (iSCSI/FC) |
| Dedicated Network | VLAN 30 (Storage), MTU 9000 |
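A minimal `/etc/multipath.conf` sketch for the round-robin MPIO policy above. The `vendor`/`product` strings are placeholders; most arrays ship recommended multipath settings, so match this against your vendor's documentation before use:

```
# /etc/multipath.conf -- minimal sketch; verify against array vendor guidance.
defaults {
    user_friendly_names yes
    find_multipaths     yes
}
devices {
    device {
        vendor               "EXAMPLE"        # placeholder: your array's vendor string
        product              "EXAMPLE-LUN"    # placeholder
        path_grouping_policy multibus         # all paths in one active/active group
        path_selector        "round-robin 0"  # round-robin I/O across paths
        no_path_retry        queue            # queue I/O during path failover
    }
}
```

Verify the resulting topology with `multipath -ll` after logging in to the iSCSI targets or zoning the FC fabric.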
Management Hosts
| Slot | Disk Type | Size | RAID | Purpose |
|---|---|---|---|---|
| 0–1 | SATA SSD | 480 GB each | RAID 1 | OS boot |
| 2–3 | SATA SSD | 960 GB each | RAID 1 | vCenter DB, bastion data |
RAID Configuration Reference
| RAID Level | Min Disks | Usable Capacity | Read IOPS | Write IOPS | Use Case |
|---|---|---|---|---|---|
| RAID 1 | 2 | 50% | 2× single | 1× single | Boot, etcd (write safety) |
| RAID 5 | 3 | (N-1)/N | Good | Degraded (parity) | Read-heavy capacity |
| RAID 6 | 4 | (N-2)/N | Good | Poor (double parity) | Large capacity, dual fault |
| RAID 10 | 4 | 50% | N× single | N/2× single | Performance + redundancy |
RAID 10 is the default recommendation for Kubernetes workloads. RAID 5/6 write penalty is unacceptable for etcd and database workloads.
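The write penalty behind that recommendation: RAID 5 turns one logical write into four physical I/Os (read data, read parity, write data, write parity) and RAID 6 into six, while RAID 10 needs only two (the mirror pair). A back-of-envelope comparison for four hypothetical 10,000-IOPS disks (real controllers cache and coalesce writes, so treat these as relative, not absolute, numbers):

```shell
disks=4; disk_iops=10000
raw=$(( disks * disk_iops ))
raid10_write=$(( raw / 2 ))   # penalty 2: one mirror copy per write
raid5_write=$(( raw / 4 ))    # penalty 4: read-modify-write of parity
raid6_write=$(( raw / 6 ))    # penalty 6: double parity
echo "RAID 10: ${raid10_write}, RAID 5: ${raid5_write}, RAID 6: ${raid6_write}"
# -> RAID 10: 20000, RAID 5: 10000, RAID 6: 6666
```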
RAID Controller Settings
| Setting | Value | Reason |
|---|---|---|
| Write Policy | Write-Back with BBU/FBU | Write-Through halves write IOPS |
| Read Policy | Read Ahead (Adaptive) | Benefits sequential reads |
| Stripe Size | 256 KB | Matches typical Kubernetes I/O patterns |
| Cache | Enable (with battery backup) | Required for Write-Back safety |
| Disk Cache | Disabled | Controller cache is sufficient; disk cache risks data loss |
| Patrol Read | Enabled (weekly) | Detects latent media errors |
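As one concrete example, the settings above map roughly to Broadcom/LSI StorCLI as follows. This is a sketch only: command syntax, controller/VD indices, and supported options vary by controller generation and vendor CLI, so verify each command against your controller's documentation:

```shell
# Sketch: Broadcom/LSI StorCLI equivalents (verify against controller docs).
storcli /c0/vall set wrcache=wb       # Write-Back (requires healthy BBU/FBU)
storcli /c0/vall set rdcache=ra       # Read Ahead
storcli /c0/vall set pdcache=off      # disable per-disk caches
storcli /c0 set patrolread=on mode=auto
# Stripe size is fixed at virtual-drive creation, e.g. (drive IDs are placeholders):
storcli /c0 add vd type=raid10 drives=252:4-7 strip=256
```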
If using NVMe drives directly (no RAID controller), configure software RAID via mdadm (Linux) or rely on vSAN/Ceph for redundancy.
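For the software-RAID path, a minimal mdadm sketch for a four-drive RAID 10 matching the Tier 1 layout. Device names are placeholders; adjust to your system's NVMe enumeration, and note that the 256 KB chunk mirrors the stripe-size recommendation above:

```shell
# Create a 4-disk software RAID 10 (device paths are placeholders).
mdadm --create /dev/md0 --level=10 --raid-devices=4 --chunk=256 \
      /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1

# Persist the array so it assembles at boot, then watch the initial sync.
mdadm --detail --scan >> /etc/mdadm.conf
cat /proc/mdstat
```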
Performance Baselines
Test storage performance before deploying Kubernetes. etcd is the most latency-sensitive component.
etcd Storage Validation
Run fio on the target disk to verify it meets etcd requirements:
```shell
fio --name=etcd-bench --ioengine=libaio --direct=1 --bs=4k \
    --iodepth=1 --rw=write --size=1G --runtime=60 \
    --filename=/var/lib/etcd/fio-test --fsync=1
```
| Metric | Minimum | Recommended |
|---|---|---|
| fsync p99 latency | < 10 ms | < 2 ms |
| Sequential write IOPS (4K) | > 500 | > 5,000 |
| Sequential write throughput | > 50 MB/s | > 200 MB/s |
If fsync p99 exceeds 10 ms, etcd will log warnings and cluster stability degrades. Use NVMe, not SATA SSD, for etcd volumes.
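To check the p99 programmatically rather than reading fio's human output, a sketch that re-runs the benchmark with JSON output and extracts the sync (fsync) p99 with jq. This assumes a recent fio whose JSON schema reports sync latency percentiles under `jobs[].sync.lat_ns.percentile`; verify the path against your fio version's output:

```shell
fio --name=etcd-bench --ioengine=libaio --direct=1 --bs=4k \
    --iodepth=1 --rw=write --size=1G --runtime=60 \
    --filename=/var/lib/etcd/fio-test --fsync=1 \
    --output-format=json --output=fio.json

# fsync p99 in nanoseconds -> milliseconds (JSON key layout may vary by fio version).
p99_ns=$(jq '.jobs[0].sync.lat_ns.percentile."99.000000"' fio.json)
awk -v ns="$p99_ns" 'BEGIN { printf "fsync p99: %.2f ms\n", ns / 1000000 }'
```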
SAN/NAS Specifications (External Storage)
If deploying shared storage instead of or alongside local disks:
| Component | Minimum | Recommended |
|---|---|---|
| Array Type | Mid-range (Dell PowerStore, NetApp AFF A250) | Enterprise (Dell PowerStore 9200, NetApp AFF A800) |
| Protocol | iSCSI 10 GbE | iSCSI 25 GbE or FC 32 Gbps |
| Cache | 64 GB | 256 GB+ |
| Drives | All-flash SSD | All-flash NVMe |
| Redundancy | Dual controllers, dual fabric | Dual controllers, dual fabric, active-active |
| Snapshot/Clone | Required for Velero CSI snapshots | Required |
Disk Replacement and Monitoring
- Configure RAID controller alerts to forward to the monitoring stack (SNMP traps → Prometheus Alertmanager).
- Set predictive failure thresholds: replace drives when SMART reports a reallocated sector count > 0, or when remaining SSD endurance (wear-leveling reserve) drops below 10%.
- Hot-spare disks: allocate one global hot spare per RAID controller for automatic rebuild.
- Rebuild time for a 3.84 TB SSD in RAID 10: approximately 2–4 hours depending on controller and load.
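The SMART thresholds above can be checked from the OS with smartctl. Attribute names vary by vendor: `Reallocated_Sector_Ct` is the common SATA/SAS attribute, while NVMe drives report endurance as "Percentage Used" in the NVMe health log. Device paths below are placeholders:

```shell
# SATA/SAS: a non-zero reallocated sector count => schedule replacement.
smartctl -A /dev/sda | awk '$2 == "Reallocated_Sector_Ct" { print "reallocated:", $NF }'

# NVMe: percentage of rated endurance consumed (replace as it approaches 90-100%).
smartctl -a /dev/nvme0 | grep -i "percentage used"
```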
Considerations
- NVMe vs. SATA SSD: NVMe provides 3–5× the IOPS of SATA SSD at substantially lower latency. Use NVMe for any disk hosting etcd or database workloads.
- Drive endurance: Select drives rated for at least 1 DWPD (Drive Writes Per Day) for mixed workloads, 3 DWPD for write-intensive (etcd, databases).
- Encryption: Use self-encrypting drives (SED) with OPAL 2.0 if data-at-rest encryption is required at the hardware level. This is in addition to Kubernetes encryption at rest.
- Capacity planning: Prometheus with 15-second scrape interval and 90-day retention consumes approximately 50–100 GB. Loki log retention at 30 days consumes 100–500 GB depending on log volume. Plan Tier 1/Tier 2 capacity accordingly.
- vSAN licensing: If using VMware vSAN, local disks must meet the vSAN HCL. Check compatibility before purchasing drives.
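The Prometheus figure above can be sanity-checked with the common sizing approximation: ingested samples/sec × bytes per sample × retention seconds, at roughly 1–2 bytes per sample after TSDB compression. This is an estimate (actual usage depends on series churn and label cardinality), but it shows that ~100,000 active series at a 15-second interval lands near the top of the 50–100 GB range:

```shell
series=100000; scrape=15; bytes_per_sample=2; days=90
sps=$(( series / scrape ))                       # ingested samples per second
gb=$(awk -v s="$sps" -v b="$bytes_per_sample" -v d="$days" \
     'BEGIN { printf "%.0f", s * b * d * 86400 / 1e9 }')
echo "${sps} samples/s -> ~${gb} GB over ${days} days"
# -> 6666 samples/s -> ~104 GB over 90 days
```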