Longhorn Storage
Purpose: Shows platform engineers how to configure Longhorn replica counts, backup targets, storage classes, and snapshot policies.
What Longhorn Does
Longhorn is a distributed block storage system for Kubernetes. It creates replicated volumes across cluster nodes, providing data redundancy without external storage infrastructure. Longhorn supports snapshots, backups to S3-compatible targets, volume expansion, and a built-in UI for storage management.
How It's Deployed
Longhorn is deployed via FluxCD from openCenter-gitops-base:
openCenter-gitops-base/applications/base/services/longhorn/
├── namespace.yaml
├── source.yaml
├── helmrelease.yaml
└── helm-values/
    └── hardened-values.yaml
Customer overlay:
applications/overlays/<cluster>/services/longhorn/
├── kustomization.yaml
└── override-values.yaml
Longhorn requires open-iscsi on each worker node. Kubespray installs this as part of the node preparation.
Key Configuration
Replica Count
The default replica count determines how many copies of each volume are maintained across nodes. The base values set this to 3 for production resilience:
# override-values.yaml
defaultSettings:
  defaultReplicaCount: 3
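For dev/test clusters with fewer than 3 worker nodes, reduce to 2 or 1. By default Longhorn places each replica of a volume on a different node, so a replica count higher than the number of schedulable nodes leaves volumes degraded.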
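As a rough sketch, a dev-cluster override might lower both replica settings together. It assumes that `persistence.defaultClassReplicaCount` controls the replica count written into the default StorageClass, while `defaultSettings.defaultReplicaCount` covers volumes created without that parameter (for example through the UI). The two-node cluster is hypothetical; check both keys against the chart version in use.

```yaml
# override-values.yaml (hypothetical two-node dev cluster)
defaultSettings:
  defaultReplicaCount: 2        # default for volumes created outside a StorageClass, e.g. via the UI
persistence:
  defaultClassReplicaCount: 2   # replica count for the chart-managed default StorageClass
```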
Storage Classes
Longhorn creates a default StorageClass. To tune that default class through Helm values:
persistence:
  defaultClass: true
  defaultClassReplicaCount: 3
  defaultDataLocality: disabled
You can also create StorageClass resources directly:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  numberOfReplicas: "2"
  dataLocality: "best-effort"
  staleReplicaTimeout: "2880"
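A workload picks up this class by naming it in a PersistentVolumeClaim. The claim below is a minimal sketch; the claim name, namespace, and size are placeholders rather than values from the base repository.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data              # hypothetical claim name
  namespace: default          # hypothetical namespace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-fast   # the StorageClass defined above
  resources:
    requests:
      storage: 10Gi           # illustrative size
```

Because the class sets `allowVolumeExpansion: true`, the requested size can later be increased by editing the claim.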
Backup Targets
Configure an S3-compatible backup target for disaster recovery:
defaultSettings:
  backupTarget: s3://longhorn-backups@us-east-1/
  backupTargetCredentialSecret: longhorn-backup-creds
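Create the credentials secret in the longhorn-system namespace. If the secret is kept in Git, commit it only as a SOPS-encrypted manifest. AWS_ENDPOINTS is needed only for S3-compatible services other than AWS:
kubectl create secret generic longhorn-backup-creds \
  -n longhorn-system \
  --from-literal=AWS_ACCESS_KEY_ID=<key> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret> \
  --from-literal=AWS_ENDPOINTS=https://s3.example.com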
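For a GitOps workflow, the same secret can be written as a manifest and encrypted with SOPS before it is committed. The sketch below shows the unencrypted form only. Where the file lives in the overlay and how SOPS is configured are assumptions that depend on the repository's setup.

```yaml
# Unencrypted form; encrypt with SOPS before committing.
apiVersion: v1
kind: Secret
metadata:
  name: longhorn-backup-creds   # must match backupTargetCredentialSecret
  namespace: longhorn-system
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <key>
  AWS_SECRET_ACCESS_KEY: <secret>
  AWS_ENDPOINTS: https://s3.example.com   # omit when targeting AWS S3 itself
```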
Snapshot and Backup Schedules
Recurring snapshot and backup jobs can be attached to individual volumes or to groups of volumes. These settings cap how many finished job records Longhorn keeps:
defaultSettings:
  recurringSuccessfulJobsHistoryLimit: 5
  recurringFailedJobsHistoryLimit: 3
Create a RecurringJob for scheduled snapshots. This one runs daily at 02:00, keeps the 7 most recent snapshots, and applies to volumes in the default group:
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-snapshot
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"
  task: snapshot
  retain: 7
  concurrency: 1
  groups:
    - default
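Scheduled backups follow the same pattern with `task: backup`, and they depend on the backup target from the previous section being configured. The job name, schedule, and retention below are illustrative, not values from the base repository.

```yaml
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: weekly-backup          # hypothetical job name
  namespace: longhorn-system
spec:
  cron: "0 3 * * 0"            # Sundays at 03:00
  task: backup                 # uploads to the configured backup target
  retain: 4                    # keep the 4 most recent backups
  concurrency: 1               # process one volume at a time
  groups:
    - default
```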
Verification
# Check Longhorn pods
kubectl get pods -n longhorn-system
# Verify volumes and their replica status
kubectl get volumes.longhorn.io -n longhorn-system
# Check nodes recognized by Longhorn
kubectl get nodes.longhorn.io -n longhorn-system
# Verify StorageClass
kubectl get storageclass | grep longhorn
# Check the Longhorn UI service (reachable from outside only if exposed via Gateway/Ingress)
kubectl get svc -n longhorn-system longhorn-frontend
Troubleshooting
Volume degraded (fewer replicas than expected): Check that enough nodes are schedulable and have available disk space. Longhorn won't schedule replicas on nodes with insufficient storage.
kubectl get replicas.longhorn.io -n longhorn-system | grep <volume-name>
PVC stuck in Pending: Verify the Longhorn CSI driver pods (longhorn-csi-plugin, csi-provisioner, csi-attacher) are running and that the StorageClass named in the PVC exists.
Common Customizations
- Data locality: Set dataLocality: best-effort to prefer scheduling replicas on the node where the workload runs, reducing network I/O.
- Disk selection: Tag specific disks on nodes for Longhorn use via node annotations.
- Guaranteed engine manager CPU: Reserve CPU for Longhorn engine/replica managers in resource-constrained clusters.
- Volume encryption: Enable LUKS-based encryption per StorageClass for volumes containing sensitive data.
- Resource limits: Adjust Longhorn manager and driver resource requests in override-values.yaml.
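As one illustration of the volume encryption customization above, an encrypted StorageClass might look like the sketch below. The class name and the `longhorn-crypto` secret name are hypothetical. The secret is assumed to exist in longhorn-system and to hold the encryption key settings that Longhorn documents for encrypted volumes (such as CRYPTO_KEY_VALUE and CRYPTO_KEY_PROVIDER). Confirm the parameters against the Longhorn version deployed.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-encrypted     # hypothetical class name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  encrypted: "true"            # enable LUKS-based encryption for volumes of this class
  # CSI secret references; all point at the same hypothetical secret
  csi.storage.k8s.io/provisioner-secret-name: longhorn-crypto
  csi.storage.k8s.io/provisioner-secret-namespace: longhorn-system
  csi.storage.k8s.io/node-publish-secret-name: longhorn-crypto
  csi.storage.k8s.io/node-publish-secret-namespace: longhorn-system
  csi.storage.k8s.io/node-stage-secret-name: longhorn-crypto
  csi.storage.k8s.io/node-stage-secret-namespace: longhorn-system
```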