Longhorn Storage

Purpose: Shows platform engineers how to configure Longhorn replica counts, backup targets, storage classes, and snapshot policies.

What Longhorn Does

Longhorn is a distributed block storage system for Kubernetes. It creates replicated volumes across cluster nodes, providing data redundancy without external storage infrastructure. Longhorn supports snapshots, backups to S3-compatible targets, volume expansion, and a built-in UI for storage management.

How It's Deployed

Longhorn is deployed via FluxCD from openCenter-gitops-base:

openCenter-gitops-base/applications/base/services/longhorn/
├── namespace.yaml
├── source.yaml
├── helmrelease.yaml
└── helm-values/
    └── hardened-values.yaml

Customer overlay:

applications/overlays/<cluster>/services/longhorn/
├── kustomization.yaml
└── override-values.yaml
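
How override-values.yaml reaches the HelmRelease depends on the base's conventions, which this page does not show. As a minimal sketch, assuming the base HelmRelease reads overrides through spec.valuesFrom from a Secret, the overlay kustomization.yaml could generate that Secret from the file. The Secret name longhorn-values-override and the key override.yaml are hypothetical and would need to match what the base actually references.

```yaml
# applications/overlays/<cluster>/services/longhorn/kustomization.yaml
# Sketch only: the Secret name and key are hypothetical and must match
# whatever the base HelmRelease lists under spec.valuesFrom.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: longhorn-system
secretGenerator:
  - name: longhorn-values-override            # hypothetical name
    files:
      - override.yaml=override-values.yaml    # key=file; the key name is an assumption
    options:
      disableNameSuffixHash: true             # keep a stable name so valuesFrom can find it
```

Disabling the name suffix hash keeps the generated Secret's name fixed, which matters when another resource refers to it by name.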

Longhorn requires open-iscsi on each worker node. Kubespray installs this as part of the node preparation.

Key Configuration

Replica Count

The default replica count determines how many copies of each volume are maintained across nodes. The base values set this to 3 for production resilience:

# override-values.yaml
defaultSettings:
  defaultReplicaCount: 3

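For dev/test clusters with fewer than 3 worker nodes, reduce the count to 2 or 1. By default Longhorn places at most one replica of a volume on each node, so a count higher than the number of schedulable nodes leaves volumes degraded.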
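
A dev/test override might look like the following sketch, which assumes a two-node cluster. The defaultSettings.defaultReplicaCount setting applies to volumes created through the Longhorn UI, while persistence.defaultClassReplicaCount sets the replica count of the chart-managed default StorageClass, so the sketch lowers both.

```yaml
# override-values.yaml (dev/test sketch, assuming two worker nodes)
defaultSettings:
  defaultReplicaCount: 2        # volumes created from the Longhorn UI
persistence:
  defaultClassReplicaCount: 2   # volumes provisioned through the default StorageClass
```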

Storage Classes

Longhorn creates a default StorageClass. The chart's persistence values control the settings of that class:

persistence:
  defaultClass: true
  defaultClassReplicaCount: 3
  defaultDataLocality: disabled

To define additional classes with different settings, create StorageClass resources directly:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-fast
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  numberOfReplicas: "2"
  dataLocality: "best-effort"
  staleReplicaTimeout: "2880"
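
To show how a workload would consume such a class, here is a minimal PersistentVolumeClaim sketch. The claim name app-data, the default namespace, and the 10Gi size are illustrative assumptions; only storageClassName comes from the class defined above.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data            # hypothetical claim name
  namespace: default        # hypothetical namespace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-fast
  resources:
    requests:
      storage: 10Gi         # illustrative size
```

Because the class sets allowVolumeExpansion: true, the requested size can be raised later by editing the claim.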

Backup Targets

Configure an S3-compatible backup target for disaster recovery:

defaultSettings:
  backupTarget: s3://longhorn-backups@us-east-1/
  backupTargetCredentialSecret: longhorn-backup-creds

Create the credentials secret (encrypt it with SOPS if you store it in Git). AWS_ENDPOINTS is only needed for S3-compatible targets that are not AWS S3:

kubectl create secret generic longhorn-backup-creds \
  -n longhorn-system \
  --from-literal=AWS_ACCESS_KEY_ID=<key> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret> \
  --from-literal=AWS_ENDPOINTS=https://s3.example.com
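
For a GitOps workflow, the same secret can be written as a manifest and encrypted before it is committed. The sketch below mirrors the keys from the command above. Where the file lives in the overlay and which SOPS rules apply to it are assumptions not covered on this page.

```yaml
# Sketch of the declarative equivalent; encrypt with SOPS before committing.
apiVersion: v1
kind: Secret
metadata:
  name: longhorn-backup-creds
  namespace: longhorn-system
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <key>
  AWS_SECRET_ACCESS_KEY: <secret>
  AWS_ENDPOINTS: https://s3.example.com   # omit when targeting AWS S3 itself
```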

Snapshot and Backup Schedules

Recurring snapshot and backup jobs can be defined per volume or globally. These settings control how many finished job records Longhorn retains:

defaultSettings:
  recurringSuccessfulJobsHistoryLimit: 5
  recurringFailedJobsHistoryLimit: 3

Create a RecurringJob for scheduled snapshots:

apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-snapshot
  namespace: longhorn-system
spec:
  cron: "0 2 * * *"
  task: snapshot
  retain: 7
  concurrency: 1
  groups:
    - default
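
Snapshots stay on the cluster's own disks, so a scheduled backup to the configured backup target is a natural companion. The following is a sketch with the same structure and task: backup; the job name weekly-backup, the schedule, and the retention count are illustrative assumptions.

```yaml
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: weekly-backup         # hypothetical name
  namespace: longhorn-system
spec:
  cron: "0 3 * * 0"           # Sundays at 03:00
  task: backup                # requires a configured backup target
  retain: 4                   # illustrative retention
  concurrency: 1
  groups:
    - default
```

Jobs in the default group apply to volumes that carry no recurring-job labels of their own. A volume can be assigned to another group with a label of the form recurring-job-group.longhorn.io/<group-name>: enabled.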

Verification

# Check Longhorn pods
kubectl get pods -n longhorn-system

# Verify volumes and their replica status
kubectl get volumes.longhorn.io -n longhorn-system

# Check nodes recognized by Longhorn
kubectl get nodes.longhorn.io -n longhorn-system

# Verify StorageClass
kubectl get storageclass | grep longhorn

# Check the Longhorn UI service (reach the UI through your Gateway/Ingress if exposed)
kubectl get svc -n longhorn-system longhorn-frontend

Troubleshooting

Volume degraded (fewer replicas than expected): Check that enough nodes are schedulable and have available disk space. Longhorn won't schedule replicas on nodes with insufficient storage.

kubectl get replicas.longhorn.io -n longhorn-system | grep <volume-name>

PVC stuck in Pending: Verify the Longhorn CSI driver pods are running and the StorageClass exists.

Common Customizations

  • Data locality: Set dataLocality: best-effort so Longhorn tries to keep one replica on the node where the workload runs, reducing network I/O.
  • Disk selection: Tag specific disks on nodes for Longhorn use via node annotations.
  • Guaranteed engine manager CPU: Reserve CPU for Longhorn engine/replica managers in resource-constrained clusters.
  • Volume encryption: Enable LUKS-based encryption per StorageClass for volumes containing sensitive data.
  • Resource limits: Adjust Longhorn manager and driver resource requests in override-values.yaml.
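
As an illustration of the volume encryption item, the sketch below shows a StorageClass that turns on encryption and points the CSI driver at a single shared key secret. The class name longhorn-encrypted and the Secret name longhorn-crypto are hypothetical. The Secret is assumed to already exist in longhorn-system with the CRYPTO_KEY_VALUE and CRYPTO_KEY_PROVIDER keys that Longhorn expects, and the nodes are assumed to have cryptsetup and the dm_crypt kernel module available.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-encrypted    # hypothetical class name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  numberOfReplicas: "3"
  staleReplicaTimeout: "2880"
  encrypted: "true"
  # One shared key secret for all volumes of this class (name is hypothetical)
  csi.storage.k8s.io/provisioner-secret-name: "longhorn-crypto"
  csi.storage.k8s.io/provisioner-secret-namespace: "longhorn-system"
  csi.storage.k8s.io/node-publish-secret-name: "longhorn-crypto"
  csi.storage.k8s.io/node-publish-secret-namespace: "longhorn-system"
  csi.storage.k8s.io/node-stage-secret-name: "longhorn-crypto"
  csi.storage.k8s.io/node-stage-secret-namespace: "longhorn-system"
```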