
Operations & Recovery

Purpose: Shows operators how to monitor build progress, update versions, and recover from failures.

Prerequisites

  • opencenter-airgap CLI installed
  • An active or completed build environment

Monitoring Build Progress

Check build status

opencenter-airgap status

This shows each build phase, its state (complete, in-progress, pending, failed), artifact counts, and elapsed time.

Review build logs

Build logs are written to logs/ in the project root. Each build run creates a timestamped log:

ls -lt logs/
# Example: logs/build-2026-02-05T08-18-05.log

# Follow the newest log only (a bare logs/build-*.log glob would follow every past run too)
tail -f "$(ls -t logs/build-*.log | head -n 1)"

Recovering from Build Failures

Resume a failed build

The CLI checkpoints progress after each phase. If a build fails (network timeout, disk full, rate limit), resume from the last checkpoint:

opencenter-airgap build --resume

The --resume flag skips completed phases and retries from the point of failure.
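
For transient failures such as network timeouts or rate limits, the resume step can be wrapped in a small retry loop. The sketch below is one way to do that. The `retry` helper, the attempt count, and the delay are illustrative assumptions, not part of the CLI. A full disk will not clear on its own, so retrying only helps when the cause is temporary.

```shell
# Hypothetical helper: run a command up to max attempts, pausing between tries.
#   $1 = max attempts, $2 = delay in seconds, rest = command to run
retry() {
    retry_max="$1"
    retry_delay="$2"
    shift 2
    retry_attempt=1
    while :; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        if [ "$retry_attempt" -ge "$retry_max" ]; then
            echo "giving up after $retry_attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $retry_attempt failed; retrying in ${retry_delay}s" >&2
        sleep "$retry_delay"
        retry_attempt=$((retry_attempt + 1))
    done
}

# Example (values are arbitrary): three attempts, 60 seconds apart.
#   retry 3 60 opencenter-airgap build --resume
```

Because each attempt passes --resume, a retry picks up from the last checkpoint rather than repeating completed phases.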

Clean and rebuild

If the build state is corrupted or you want a fresh start:

opencenter-airgap clean
opencenter-airgap build

The clean command removes downloaded artifacts and build state but preserves config/versions.env and config/components.yaml.

Recover from a partial Zarf deploy in Zone C

If zarf package deploy fails partway through on the bastion:

# Check which components deployed
zarf package list

# Re-run deploy (Zarf is idempotent for completed components)
zarf package deploy zarf-package-opencenter-airgap-amd64-*.tar.zst --confirm
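
The wildcard in the deploy command expands to every matching package in the directory, which is ambiguous if an older build is still present. A minimal sketch of a guard that resolves the pattern to exactly one file first. `single_match` is a hypothetical helper, and it assumes package paths contain no spaces.

```shell
# Hypothetical helper: resolve a glob pattern to exactly one existing file.
# Prints the path on success; fails if the pattern matches zero or several files.
single_match() {
    # The pattern is deliberately unquoted so the shell expands it.
    set -- $1
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        echo "expected exactly one match, found: $*" >&2
        return 1
    fi
    printf '%s\n' "$1"
}

# Example: deploy only when the package is unambiguous.
#   pkg=$(single_match 'zarf-package-opencenter-airgap-amd64-*.tar.zst') &&
#       zarf package deploy "$pkg" --confirm
```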

Updating Versions

Patch update (e.g., Kubernetes 1.34.3 → 1.34.4)

  1. Edit config/versions.env:
    KUBERNETES_VERSION="v1.34.4"
  2. Rebuild:
    opencenter-airgap build
  3. Transfer the new package to Zone C and redeploy.
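
Step 1 can be scripted when the same version bump is applied in several environments. The sketch below is one possible approach, assuming config/versions.env holds one KEY="value" assignment per line as in the example above. `set_env_version` is a hypothetical helper, not an opencenter-airgap command.

```shell
# Hypothetical helper: replace an existing KEY="value" line in an env file.
#   $1 = file, $2 = key, $3 = new value
set_env_version() {
    env_file="$1"
    env_key="$2"
    env_value="$3"
    # Refuse to continue if the key is absent, so a typo does not pass silently.
    if ! grep -q "^${env_key}=" "$env_file"; then
        echo "$env_key not found in $env_file" >&2
        return 1
    fi
    # Write to a temp file and move it into place (avoids non-portable sed -i).
    env_tmp="${env_file}.tmp.$$"
    sed "s|^${env_key}=.*|${env_key}=\"${env_value}\"|" "$env_file" > "$env_tmp" &&
        mv "$env_tmp" "$env_file"
}

# Example:
#   set_env_version config/versions.env KUBERNETES_VERSION v1.34.4
```

The value is substituted into a sed expression, so it should not contain `|` or `&`; plain version strings are fine.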

Adding a new platform service

  1. Add the image and chart to the manifest:
    opencenter-airgap add image quay.io/newservice:v1.0
  2. Rebuild with --resume to fetch only the new artifacts:
    opencenter-airgap build --resume

Bastion Recovery

Registry container stopped

podman start local-registry

# Verify
curl -s http://localhost:35000/v2/_catalog
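
The registry may need a moment after the container starts before it accepts connections, so a single curl can fail even though the restart worked. A minimal polling sketch follows. `wait_for_url` and its 30-second default are assumptions for illustration.

```shell
# Hypothetical helper: poll a URL once per second until it answers with success.
#   $1 = URL, $2 = timeout in seconds (default 30)
wait_for_url() {
    wait_url="$1"
    wait_limit="${2:-30}"
    wait_elapsed=0
    # -f makes curl return non-zero on HTTP errors; -o /dev/null discards the body.
    until curl -sf -o /dev/null "$wait_url"; do
        if [ "$wait_elapsed" -ge "$wait_limit" ]; then
            echo "timed out after ${wait_limit}s waiting for $wait_url" >&2
            return 1
        fi
        sleep 1
        wait_elapsed=$((wait_elapsed + 1))
    done
}

# Example:
#   podman start local-registry && wait_for_url http://localhost:35000/v2/ 30
```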

Registry data lost

If /var/lib/registry is corrupted, redeploy the Zarf package. The k8s-images component reloads all images:

zarf package deploy zarf-package-opencenter-airgap-amd64-*.tar.zst --confirm

Nginx file server down

Check the nginx container or process:

podman ps --filter name=nginx
# If not running:
/opt/opencenter/target-scripts/setup-all.sh

Python venv corrupted

Recreate the virtual environment from the bundled wheels:

# --clear empties the existing (corrupted) directory before recreating it
python3 -m venv --clear /opt/opencenter/venv
/opt/opencenter/venv/bin/pip install --no-index \
    --find-links=/opt/opencenter/python-wheels/ \
    ansible jinja2 netaddr

Verification

After any recovery action, run these checks:

# Registry health
curl -s http://localhost:35000/v2/_catalog | python3 -m json.tool

# Nginx serving packages
curl -sI http://localhost:80/

# Python venv functional
/opt/opencenter/venv/bin/ansible --version

# Target node connectivity
ssh deployer@<NODE_IP> "curl -sf http://<BASTION_IP>:35000/v2/ && echo OK"
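
The bastion-local checks above can be wrapped so that one run reports every result instead of stopping at the first failure. This is a sketch only. The `check` helper and the `failures` counter are hypothetical names, and the example wiring adds curl's -f flag so that HTTP errors count as failures.

```shell
# Count of failed checks, updated by check().
failures=0

# Hypothetical helper: run one named check quietly and print PASS or FAIL.
#   $1 = label, rest = command to run
check() {
    check_name="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS  $check_name"
    else
        echo "FAIL  $check_name"
        failures=$((failures + 1))
    fi
}

# Example wiring with the commands from this section:
#   check "registry" curl -sf http://localhost:35000/v2/_catalog
#   check "nginx"    curl -sfI http://localhost:80/
#   check "venv"     /opt/opencenter/venv/bin/ansible --version
#   [ "$failures" -eq 0 ]   # non-zero exit status if anything failed
```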

Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| --resume starts from scratch | Build state file deleted | Run opencenter-airgap clean then build |
| Registry returns 500 errors | Disk full on /var/lib/registry | Free space or expand the volume |
| Kubespray fails pulling images | Registry mirror not configured | Verify containerd config on target nodes points to bastion |
| Deploy hangs at k8s-images | Large image set, slow disk I/O | Wait: loading hundreds of images takes time on spinning disks |
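
For the disk-full case, usage on the registry volume can be checked before and after freeing space. A minimal sketch, assuming a POSIX df. `disk_used_pct`, `warn_if_full`, and the 90 percent default threshold are illustrative choices.

```shell
# Hypothetical helper: print the used-space percentage (integer, no % sign)
# of the filesystem that holds the given path.
disk_used_pct() {
    # df -P prints stable POSIX columns; field 5 on line 2 is capacity, e.g. "83%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: warn and return 1 when usage is at or above a limit.
#   $1 = path, $2 = limit in percent (default 90, an arbitrary example value)
warn_if_full() {
    full_path="$1"
    full_limit="${2:-90}"
    full_pct=$(disk_used_pct "$full_path")
    # Return 2 if df produced nothing usable (for example, the path does not exist).
    [ -n "$full_pct" ] || return 2
    if [ "$full_pct" -ge "$full_limit" ]; then
        echo "WARNING: $full_path is ${full_pct}% full" >&2
        return 1
    fi
}

# Example:
#   warn_if_full /var/lib/registry 90
```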