Troubleshooting

Purpose: For platform engineers and field operators. Lists the failure modes the project actually hits in production and how to recover from each. Errors are grouped by phase: build, deploy, and verification.

Build phase

Configuration not found: config/versions.env

The build was started without running init first.

opencenter-airgap init
$EDITOR config/versions.env
opencenter-airgap build

Network downloads timing out

The contents of assets/ are downloaded from external registries. Symptoms include connection timeouts, EOF errors, and 429 Too Many Requests responses.

# Use a proxy if your network requires one.
export HTTPS_PROXY=http://proxy.example.com:8080
export NO_PROXY=localhost,127.0.0.1

# Resume; the orchestrator restarts from the last completed step.
opencenter-airgap build

If a public registry is rate-limiting you, either reduce the build's parallelism or wait and retry. The build is idempotent, so partial state survives across retries.
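One way to automate "wait and retry" is a small wrapper that re-runs the build with a growing delay. The following is a minimal sketch, not part of opencenter-airgap: the function name retry_with_backoff and its arguments are hypothetical, and it relies only on the resume behavior described above.

```shell
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it succeeds or MAX_ATTEMPTS is
# reached, doubling the delay after each failure.
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempt(s): $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, `retry_with_backoff 5 60 opencenter-airgap build` would try up to five times, waiting 60, 120, 240, and 480 seconds between attempts.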

No space left on device

Peak usage during a build is ~120 GB across build/, assets/, and dist/.

df -h
opencenter-airgap clean              # remove build/ and dist/
# or move the build to a larger volume:
opencenter-airgap build --state /mnt/big/state.json
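A pre-flight check can catch this before the build starts. The sketch below assumes build/, assets/, and dist/ all live on the same filesystem as the directory you pass in; the function name require_free_gb is hypothetical.

```shell
# require_free_gb DIR MIN_GB
# Hypothetical pre-flight check: succeeds only if the filesystem holding DIR
# has at least MIN_GB gibibytes available.
require_free_gb() {
  dir=$1
  min_gb=$2
  # df -P prints one POSIX-format line per filesystem; column 4 is the
  # available space, reported in 1 KiB blocks because of -k.
  avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  if [ -z "$avail_kb" ]; then
    echo "could not read free space for $dir" >&2
    return 1
  fi
  avail_gb=$((avail_kb / 1024 / 1024))
  if [ "$avail_gb" -lt "$min_gb" ]; then
    echo "only ${avail_gb} GB free on $dir, need ${min_gb} GB" >&2
    return 1
  fi
}
```

Usage: `require_free_gb . 120 && opencenter-airgap build`.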

Component name '…​' is invalid

The schema enforces lowercase alphanumerics with hyphens, starting with a letter. Rename the component:

# Bad
- name: My_Tool

# Good
- name: my-tool
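To check a name before running the build, the stated rule can be approximated with a regular expression. This is a sketch of the rule as worded above, not the project's actual schema, which may be stricter; the function name is hypothetical.

```shell
# is_valid_component_name NAME
# Approximates the rule stated above: lowercase letters, digits, and hyphens,
# starting with a letter. LC_ALL=C keeps [a-z] limited to ASCII lowercase.
is_valid_component_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9-]*$'
}
```

Usage: `is_valid_component_name my-tool && echo ok`.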

Image '…​:latest' uses mutable tag

Pin the image to a specific tag. The build refuses :latest and untagged images.
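The check can be sketched in shell to screen image references ahead of a build. The function name is hypothetical. This section does not say how the build treats digest-only references (name@sha256:...); accepting them here is an assumption.

```shell
# is_pinned_image REF
# Succeeds if REF carries an explicit tag other than "latest".
# Assumption: digest references (name@sha256:...) count as pinned.
is_pinned_image() {
  ref=$1
  case "$ref" in
    *@sha256:*) return 0 ;;
  esac
  # Look only at the final path component so that a registry port such as
  # localhost:5000/nginx is not mistaken for a tag.
  last=${ref##*/}
  case "$last" in
    *:latest) return 1 ;;
    *:?*)     return 0 ;;
    *)        return 1 ;;
  esac
}
```

Usage: `is_pinned_image nginx:1.25.3 || echo "pin this image"`.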

config hash mismatch — refusing to resume

config/versions.env or config/components.yaml has changed since the failed build started.

opencenter-airgap status              # confirm
opencenter-airgap build --clean       # full rerun
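To see which file changed before committing to a full rerun, you can record checksums when a build starts and compare them later. This sketch is independent of how opencenter-airgap computes its own config hash; the function names and the snapshot file are hypothetical. Run both functions from the same directory, because the snapshot stores paths as given.

```shell
# snapshot_config SNAPSHOT_FILE FILE...
# Records SHA-256 checksums of the given files.
snapshot_config() {
  snapshot=$1
  shift
  sha256sum "$@" > "$snapshot"
}

# config_changed SNAPSHOT_FILE
# Succeeds (exit 0) if any recorded file no longer matches its checksum and
# prints the sha256sum report lines for the files that differ.
config_changed() {
  report=$(sha256sum -c "$1" 2>/dev/null) && return 1
  printf '%s\n' "$report" | grep -v ': OK$' || true
  return 0
}
```

Usage: `snapshot_config build-start.sha256 config/versions.env config/components.yaml` before the build, then `config_changed build-start.sha256` after a failure.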

zarf: not found

The Zarf CLI is not installed on the build host. Install it from https://zarf.dev/install/ and re-run. If you do not need the .tar.zst package right now, the rest of the build still runs and you keep zarf.yaml, assets/, and dist/artifact-manifest.json.

cosign generate-key-pair failed

The cosign binary is not installed, or it is not on PATH.

# macOS
brew install cosign

# Linux (amd64); writing to /usr/local/bin usually requires root
sudo curl -sSfL https://github.com/sigstore/cosign/releases/latest/download/cosign-linux-amd64 \
  -o /usr/local/bin/cosign
sudo chmod +x /usr/local/bin/cosign

Deploy phase

failed to extract package: unexpected EOF

The package was truncated or corrupted in transit. Verify it against its checksum sidecar:

sha256sum -c zarf-package-*.tar.zst.sha256

If the checksum fails, re-copy the package from dist/ on the build host. If the checksum passes but extraction still fails, transfer the package again with a tool that does its own integrity check (e.g. rsync --checksum).
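The check is easy to wrap so that it runs after every transfer. A minimal sketch, assuming the .sha256 sidecar sits next to the package and lists the bare file name, as the command above implies; the function name verify_package is hypothetical.

```shell
# verify_package PACKAGE
# Checks PACKAGE against its PACKAGE.sha256 sidecar. sha256sum -c resolves
# the file name listed in the sidecar relative to the current directory, so
# the check runs in a subshell from the directory that holds the package.
verify_package() {
  pkg=$1
  (
    cd "$(dirname "$pkg")" || exit 1
    sha256sum -c "$(basename "$pkg").sha256"
  )
}
```

Usage: `verify_package /path/to/package.tar.zst || echo "re-copy the package"`, where the path is a placeholder for your own package location.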

Container registry will not start

Check the registry container directly:

podman logs local-registry
ss -tlnp | grep ':5000\b'   # is anything else on the port?
df -h /var/lib/registry

Common causes are port 5000 already in use, a previous run leaving an unhealthy container, or insufficient disk for the image layer cache.
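The port check can be scripted so that a deploy wrapper refuses to start when port 5000 is taken. The sketch below parses `ss -tln` output from standard input, which keeps it testable without a live socket; the function name is hypothetical.

```shell
# port_is_listening PORT
# Reads `ss -tln` output on stdin and succeeds if any socket listens on PORT.
# Column 4 of that output is the local "address:port" pair; the port is the
# text after the last colon, which also handles IPv6 forms such as [::]:5000.
port_is_listening() {
  awk -v port="$1" '
    NR > 1 {
      n = split($4, parts, ":")
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}
```

Usage: `ss -tln | port_is_listening 5000 && echo "port 5000 is already in use"`.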

Nginx file server returns 404 on /debs/ or /pypi/

The bind mounts did not pick up the unpacked assets.

docker inspect opencenter-nginx --format '{{range .Mounts}}{{.Source}} -> {{.Destination}}{{println}}{{end}}'
ls /opt/opencenter/debs /opt/opencenter/pypi

Restart the container after fixing the mount source:

docker rm -f opencenter-nginx
opencenter-airgap serve dist/zarf-package-*.tar.zst

Ansible: UNREACHABLE! …​ ssh

# Confirm SSH works directly.
ssh -i ~/.ssh/id_rsa deployer@node1

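# Add user and key file to the inventory. Merge these keys into the existing
# top-level all: mapping; appending a second all: block would create a
# duplicate key that can override the hosts already defined.
$EDITOR /opt/opencenter/kubespray/inventory/mycluster/inventory.yml
# all:
#   vars:
#     ansible_user: deployer
#     ansible_ssh_private_key_file: ~/.ssh/id_rsa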

If this is a new host whose key is not yet in known_hosts, set the environment variable ANSIBLE_HOST_KEY_CHECKING=False for the first run only.

E: Unable to locate package kubelet on cluster nodes

The offline-repo playbook did not run, so apt is still pointing at the public Ubuntu mirrors.

cd /opt/opencenter/playbook
ansible-playbook -i ../kubespray/inventory/mycluster/inventory.yml offline-repo.yml

# Confirm on the node:
ssh node1 "cat /etc/apt/sources.list.d/opencenter.list"
ssh node1 "apt-cache policy kubelet"

Pods stuck in ImagePullBackOff

containerd on the node is not configured to pull from the bastion registry.

ssh node1 "grep -A2 'registry.mirrors' /etc/containerd/config.toml"
ssh node1 "curl -s http://${BASTION_IP}:5000/v2/_catalog" | jq '.repositories | length'   # jq runs locally, so the node does not need it
ssh node1 "sudo systemctl restart containerd"

If the registry is reachable but the specific image is missing, that image was not collected during the build. Re-run opencenter-airgap scan --repos and rebuild.
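To confirm that one specific image is missing, you can ask the registry for that repository's tags through the Registry HTTP API v2 endpoint /v2/<name>/tags/list. The helper below only builds the URL. Its name is hypothetical, and it assumes the bastion registry stores images under the same repository path you pass in, which this section does not state.

```shell
# tags_list_url REGISTRY IMAGE
# Prints the Registry HTTP API v2 URL that lists the tags of IMAGE's
# repository. IMAGE is "repository[:tag]"; the tag, if any, is dropped.
tags_list_url() {
  registry=$1
  image=$2
  last=${image##*/}            # final path component, e.g. "pause:3.9"
  case "$last" in
    *:*) repo=${image%:*} ;;   # strip the tag
    *)   repo=$image ;;
  esac
  printf 'http://%s/v2/%s/tags/list\n' "$registry" "$repo"
}
```

Usage, with an illustrative image name: `ssh node1 "curl -s $(tags_list_url "${BASTION_IP}:5000" coredns/coredns:v1.11.1)"`. A repository the registry does not hold returns an error body instead of a tag list.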

FluxCD Kustomization shows reconciliation failed

kubectl logs -n flux-system deploy/source-controller
kubectl logs -n flux-system deploy/kustomize-controller
flux reconcile kustomization flux-system --with-source

The most common cause is that Gitea is not reachable from the cluster nodes. Run kubectl get svc -n gitea, then confirm from a node that the Service's DNS name or ClusterIP responds.

Verification phase

checksum verification failed

The file is corrupted or has been modified. Re-copy the package and the .sha256 sidecar from dist/ on the build host.

signature verification failed

The package was re-signed, the wrong public key is being used, or the file was modified.

# Confirm the public key matches the one used to sign:
cosign public-key --key .secrets/signing-key.key | diff - .secrets/signing-key.pub

If you operate the build system yourself, regenerate the keys (opencenter-airgap keygen --force) and rebuild. If you received the package from a third party, contact the publisher.

N image(s) use mutable or missing tags

The SBOM contains image references that are tagged latest or have no tag. Find them in the SBOM:

jq -r '.artifacts[]
        | select(.type == "image")
        | select(.version == "latest" or .version == null)
        | .name' \
   dist/zarf-package-*-sbom.json

For each entry, pin the version in config/components.yaml (or the upstream Helm chart values) and rebuild.

N HIGH/CRITICAL vulnerabilities reported

The SBOM scanner flagged CVEs. Either patch to a fixed version of the underlying image, or document and accept the risk in your security policy.

jq -r '.vulnerabilities[]
        | select(.severity == "CRITICAL" or .severity == "HIGH")
        | "\(.severity) \(.id) \(.affects // "")"' \
   dist/zarf-package-*-sbom.json

Diagnostics

Useful commands when filing an issue:

opencenter-airgap version
opencenter-airgap status
sha256sum config/versions.env config/components.yaml
tail -n 50 build/build-*.log   # -n form also works when several logs match

Include those four outputs plus a description of what you ran. See ../../CONTRIBUTING.md[CONTRIBUTING.md].
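The four commands can be collected into a single file to attach to the issue. This is a convenience sketch: the function names and the output file are hypothetical, and a command that fails is recorded rather than stopping the collection.

```shell
# _diag_section OUTPUT_FILE COMMAND [ARGS...]
# Appends a heading, the command's combined output, and a blank line.
_diag_section() {
  diag_out=$1
  shift
  printf '## %s\n' "$*" >> "$diag_out"
  "$@" >> "$diag_out" 2>&1 || printf '(command failed)\n' >> "$diag_out"
  printf '\n' >> "$diag_out"
}

# collect_diagnostics OUTPUT_FILE
# Runs the diagnostic commands listed above from the project root and writes
# their output to OUTPUT_FILE, overwriting any previous contents.
collect_diagnostics() {
  : > "$1"
  _diag_section "$1" opencenter-airgap version
  _diag_section "$1" opencenter-airgap status
  _diag_section "$1" sha256sum config/versions.env config/components.yaml
  _diag_section "$1" tail -n 50 build/build-*.log
}
```

Usage: `collect_diagnostics diagnostics.txt`, then review the file for secrets before attaching it.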

Related pages:

  • ../operations/resume-failed-build.md[Resume a Failed Build]
  • ../operations/verify-package.md[Verify a Built Package]
  • build-steps.md[Build Steps]: failure modes are repeated per step in this reference.