Helm Release Rollback

Purpose: For operators. Shows how to diagnose a failed HelmRelease upgrade and roll back to the previous revision.

Prerequisites

  • kubectl access to the cluster
  • helm CLI installed
  • flux CLI installed

Diagnosing a Failed Upgrade

Step 1: Check HelmRelease Status

flux get helmreleases -A

Look for releases with Ready: False and note the error message.
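To surface only the failing releases, recent versions of the flux CLI accept a status selector; a plain grep works as a fallback on any version:

```shell
# List only HelmReleases that are not Ready (requires a flux CLI version
# that supports --status-selector):
flux get helmreleases -A --status-selector ready=false

# Fallback that works with any flux version:
flux get helmreleases -A | grep -i false
```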

Step 2: Get Detailed Events

kubectl describe helmrelease <name> -n <namespace>

The Events section shows the failure reason. Common causes:

  • Invalid Helm values (template rendering error)
  • Failed pre-upgrade hooks
  • Resource conflicts (another controller owns the resource)
  • Timeout waiting for pods to become ready

Step 3: Check Helm History

helm history <release-name> -n <namespace>

This shows all revisions with their status (deployed, failed, superseded).

Rolling Back

Option A: Revert the Git Commit

The GitOps approach — revert the commit that introduced the bad values and push. FluxCD reconciles to the previous state automatically.

git revert <commit-hash>
git push

This is the preferred method because it keeps Git as the source of truth.

Option B: Manual Helm Rollback

If you need an immediate fix before the Git revert propagates:

  1. Suspend FluxCD reconciliation to prevent it from re-applying the bad state:

flux suspend helmrelease <name> -n <namespace>

  2. Roll back to the last working revision:

helm rollback <release-name> <revision> -n <namespace>

  3. Fix the values in Git, commit, and push.

  4. Resume reconciliation:

flux resume helmrelease <name> -n <namespace>

Skipping step 1 causes FluxCD to re-apply the failed values on the next reconciliation interval.

Option C: Force Upgrade with Fixed Values

If the release is stuck in a failed state and rollback does not work:

  1. Suspend the HelmRelease
  2. Uninstall the release manually:

helm uninstall <release-name> -n <namespace>

  3. Fix the values in Git and push
  4. Resume the HelmRelease — FluxCD performs a fresh install

This approach causes downtime for the affected service.
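If you want to preserve the revision history through the uninstall, so that helm history still works afterwards, Helm 3 has a flag for that:

```shell
# Uninstall but keep the release history records:
helm uninstall <release-name> -n <namespace> --keep-history
```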

Preventing Failed Upgrades

  • Test value changes locally with helm template before committing
  • Use opencenter cluster validate to catch schema issues
  • Pin chart versions in HelmRelease specs — avoid floating tags
  • Set spec.upgrade.remediation.retries to allow automatic retry on transient failures

Checking Post-Rollback Health

After rollback, verify the service is healthy:

kubectl get pods -n <namespace>
helm status <release-name> -n <namespace>
flux get helmrelease <name> -n <namespace>
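If pods are still rolling, you can block until they are Ready instead of polling. The deployment name is a placeholder; adjust the kind for StatefulSets or DaemonSets:

```shell
# Wait up to two minutes for the rollout to complete:
kubectl rollout status deployment/<deployment-name> -n <namespace> --timeout=120s
```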