Operations & Recovery
Purpose: Shows operators how to monitor build progress, update versions, and recover from failures.
Prerequisites
- opencenter-airgap CLI installed
- An active or completed build environment
Monitoring Build Progress
Check build status
opencenter-airgap status
This shows each build phase, its state (complete, in-progress, pending, failed), artifact counts, and elapsed time.
Review build logs
Build logs are written to logs/ in the project root. Each build run creates a timestamped log:
ls -lt logs/
# Example: logs/build-2026-02-05T08-18-05.log
tail -f logs/build-*.log # Follow active build
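When several runs have accumulated, the glob in the `tail` command matches every log in the directory. The sketch below picks out only the newest one. It assumes log names follow the timestamped pattern shown in the example, so that sorting the names as strings also sorts them by time; `latest_build_log` is a hypothetical helper, not part of the CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the newest build log in a directory.
# Assumes names like build-2026-02-05T08-18-05.log, which sort
# chronologically when compared as plain strings.
latest_build_log() {
  local dir="${1:-logs}" newest="" f
  for f in "$dir"/build-*.log; do
    [ -e "$f" ] || continue   # the glob matched nothing
    newest="$f"               # globs expand in sorted order; keep the last
  done
  [ -n "$newest" ] || return 1
  printf '%s\n' "$newest"
}

# Example usage (not executed here):
#   tail -f "$(latest_build_log logs)"
```

The function returns a non-zero status when no log exists, so a calling script can tell "no build has run yet" apart from an empty log.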
Recovering from Build Failures
Resume a failed build
The CLI checkpoints progress after each phase. If a build fails (network timeout, disk full, rate limit), resume from the last checkpoint:
opencenter-airgap build --resume
The --resume flag skips completed phases and reruns the build starting with the phase that failed.
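For transient failures such as network timeouts or rate limits, the resume step can be wrapped in a bounded retry loop. This is a minimal sketch: `retry` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary assumptions. It will not help with causes that need manual action, such as a full disk.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run a command up to MAX times, pausing between tries.
# Usage: retry MAX DELAY_SECONDS command [args...]
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example usage (not executed here; 3 attempts and 60 seconds are assumptions):
#   opencenter-airgap build || retry 3 60 opencenter-airgap build --resume
```

Capping the number of attempts keeps a persistent failure from looping forever, and the messages go to stderr so they do not mix with the build's own output.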
Clean and rebuild
If the build state is corrupted or you want a fresh start:
opencenter-airgap clean
opencenter-airgap build
The clean command removes downloaded artifacts and build state but preserves config/versions.env and config/components.yaml.
Recover from a partial Zarf deploy in Zone C
If zarf package deploy fails partway through on the bastion:
# Check which components deployed
zarf package list
# Re-run deploy (Zarf is idempotent for completed components)
zarf package deploy zarf-package-opencenter-airgap-amd64-*.tar.zst --confirm
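The deploy command relies on a glob, and the shell passes every matching file to zarf. If an older package is still in the directory after an update, it may be worth confirming that exactly one file matches first. The sketch below does that; `find_package` is a hypothetical helper, and only the file name pattern comes from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the one package file in DIR, or fail when
# zero or several files match the pattern used by the deploy command.
find_package() {
  local dir="${1:-.}" count=0 found="" f
  for f in "$dir"/zarf-package-opencenter-airgap-amd64-*.tar.zst; do
    [ -e "$f" ] || continue   # the glob matched nothing
    count=$((count + 1))
    found="$f"
  done
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one package in $dir, found $count" >&2
    return 1
  fi
  printf '%s\n' "$found"
}

# Example usage (not executed here):
#   pkg="$(find_package .)" && zarf package deploy "$pkg" --confirm
```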
Updating Versions
Patch update (e.g., Kubernetes 1.34.3 → 1.34.4)
- Edit config/versions.env: KUBERNETES_VERSION="v1.34.4"
- Rebuild: opencenter-airgap build
- Transfer the new package to Zone C and redeploy.
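The version edit above can also be scripted, which helps when the same bump is applied in several environments. This is a sketch under the assumption that config/versions.env holds one `KEY="value"` assignment per line, as in the example; `set_env_version` is a hypothetical helper, and it expects plain version strings without `|` or `&` characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set KEY="VALUE" in an env-style file.
# Assumes one KEY="value" assignment per line.
set_env_version() {
  local file="$1" key="$2" value="$3"
  if ! grep -q "^${key}=" "$file"; then
    echo "$key not found in $file" >&2
    return 1
  fi
  cp "$file" "$file.bak"   # keep the previous pins for rollback
  # Rewrite only the matching line; every other line passes through unchanged.
  sed "s|^${key}=.*|${key}=\"${value}\"|" "$file.bak" > "$file"
}

# Example usage (not executed here):
#   set_env_version config/versions.env KUBERNETES_VERSION v1.34.4
```

The helper refuses to append a key that is not already present, so a typo in the key name fails loudly instead of adding an unused line to the file.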
Adding a new platform service
- Add the image and chart to the manifest: opencenter-airgap add image quay.io/newservice:v1.0
- Rebuild with --resume to fetch only the new artifacts.
Bastion Recovery
Registry container stopped
podman start local-registry
# Verify
curl -s http://localhost:35000/v2/_catalog
Registry data lost
If /var/lib/registry is corrupted, redeploy the Zarf package. The k8s-images component reloads all images:
zarf package deploy zarf-package-opencenter-airgap-amd64-*.tar.zst --confirm
Nginx file server down
Check the nginx container or process:
podman ps --filter name=nginx
# If not running:
/opt/opencenter/target-scripts/setup-all.sh
Python venv corrupted
Recreate the virtual environment from the bundled wheels:
python3 -m venv /opt/opencenter/venv
/opt/opencenter/venv/bin/pip install --no-index \
--find-links=/opt/opencenter/python-wheels/ \
ansible jinja2 netaddr
Verification
After any recovery action, run these checks:
# Registry health
curl -s http://localhost:35000/v2/_catalog | python3 -m json.tool
# Nginx serving packages
curl -sI http://localhost:80/
# Python venv functional
/opt/opencenter/venv/bin/ansible --version
# Target node connectivity
ssh deployer@<NODE_IP> "curl -sf http://<BASTION_IP>:35000/v2/ && echo OK"
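The checks above can be collected into one script that prints a pass or fail line per check and keeps a failure count. This is a sketch: `run_check` and `failures` are hypothetical names. The usage comment adds curl's `-f` flag so that an HTTP error status counts as a failure, which is a change from the commands listed above.

```shell
#!/usr/bin/env bash
# Hypothetical runner: execute a named check, print PASS or FAIL,
# and count failures in the global variable `failures`.
failures=0
run_check() {
  local name="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $name"
  else
    echo "FAIL  $name"
    failures=$((failures + 1))
  fi
}

# Example usage on the bastion (not executed here):
#   run_check "registry" curl -sf http://localhost:35000/v2/_catalog
#   run_check "nginx"    curl -sfI http://localhost:80/
#   run_check "venv"     /opt/opencenter/venv/bin/ansible --version
#   [ "$failures" -eq 0 ] || exit 1
```

Running every check before exiting, rather than stopping at the first failure, shows the full state of the bastion in one pass.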
Troubleshooting
| Symptom | Likely cause | Fix |
|---|---|---|
| --resume starts from scratch | Build state file deleted | Run opencenter-airgap clean then build |
| Registry returns 500 errors | Disk full on /var/lib/registry | Free space or expand the volume |
| Kubespray fails pulling images | Registry mirror not configured | Verify containerd config on target nodes points to bastion |
| Deploy hangs at k8s-images | Large image set, slow disk I/O | Wait. Loading hundreds of images takes time on spinning disks |
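Since a full disk under /var/lib/registry is one of the listed causes, a quick usage check can confirm or rule it out before digging further. The sketch below reads the capacity column from POSIX `df -P`; `disk_used_pct`, `disk_nearly_full`, and the default 90 percent threshold are all assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the used-space percentage (an integer) of the
# filesystem that holds the given path, taken from `df -P` output.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Succeeds when usage is at or above THRESHOLD percent (default 90, an assumption).
disk_nearly_full() {
  local path="$1" threshold="${2:-90}" used
  used="$(disk_used_pct "$path")" || return 2
  [ "$used" -ge "$threshold" ]
}

# Example usage (not executed here):
#   disk_nearly_full /var/lib/registry && echo "registry volume is nearly full"
```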