Root cause investigation on otech.omani.works (kubectl, sanitized):
$ kubectl get pods -n powerdns
create-zone-if-not-exist-sh-tjtr4 0/1 CreateContainerConfigError 4h
powerdns-57d7d49f99-{9hrb4,lxlgt,nkmht} 0/1 CreateContainerConfigError 4h
dnsdist-594dbfc5f-wznsw 1/1 Running 4h
$ kubectl get secrets -n powerdns
powerdns Opaque 1 4h
powerdns-api-tls-8kxpx Opaque 1 4h (NO `powerdns-api-credentials`, NO `pdns-pg-app`)
$ kubectl describe pod ... powerdns-57d7d49f99-9hrb4
Environment:
PDNS_API_KEY: <set to the key 'api-key' in secret 'powerdns-api-credentials'> Optional: false
PDNS_DB_HOST: <set to the key 'host' in secret 'pdns-pg-app'> Optional: false
State: Waiting Reason: CreateContainerConfigError
The handover's chicken-egg-with-secret theory was directionally right but
the cause was more fundamental:
1. Wrapper chart's api-credentials-secret.yaml (1.1.2) was a no-op
unless operator set `apiKey` value out-of-band — comment said the
deployment would "fail to start until the named Secret exists" as
"the explicit signal we want". On a Sovereign that bootstraps from
bp-* OCI artifacts, no operator is standing by, so the Secret is
never created and pods sit in CreateContainerConfigError forever.
2. The upstream chart's `create-zone-if-not-exists-sh` Job is rendered
whenever both `zoneName` and `api.key` are set — defaulting
`zoneName: "example.de."` it ALWAYS rendered and ALWAYS failed
(same missing Secret). Catalyst doesn't want this Job at all
because zones are loaded later by pool-domain-manager (PDM).
3. The chart's CNPG Cluster template is gated behind
Capabilities.APIVersions.Has "postgresql.cnpg.io/v1" — on a fresh
Sovereign without bp-cnpg yet (bp-cnpg is on the roadmap, not in
bootstrap-kit), no Cluster is rendered and `pdns-pg-app` Secret
never materialises. With Helm `--wait`, install times out
("context deadline exceeded") even though the manifests applied
cleanly.
Fix:
* api-credentials-secret.yaml: self-generate via Helm `lookup` +
`randAlphaNum 32`. First install creates fresh randoms; every
subsequent reconcile reads back the existing values from the
Secret so the API key never rotates on upgrade. Operator can
still pin specific values via .Values.powerdns.apiKey /
.Values.powerdns.webserverPassword, or skip Secret creation
entirely via .Values.powerdns.useExistingApiSecret. Same pattern
as bitnami/postgresql, bitnami/keycloak.
* values.yaml: set `powerdns.zoneName: ""` so upstream chart's
`{{- if and .Values.powerdns.zoneName .Values.powerdns.api.key }}`
gate skips the create-zone Job entirely. Catalyst's PDM creates
zones via the REST API after the cluster comes up; we don't want
a placeholder `example.de.` zone in production.
* HelmRelease (both _template and otech.omani.works overlays):
`install.disableWait: true` + `upgrade.disableWait: true` so the
HelmRelease reports Ready as soon as manifests apply cleanly,
rather than gating on powerdns Deployment readiness which depends
on bp-cnpg landing first to synthesise `pdns-pg-app`. Runtime
convergence is observed via kubectl, not gated on Helm.
Live error this addresses:
Helm upgrade failed for release powerdns/powerdns with chart
bp-powerdns@1.1.2: context deadline exceeded
Verified locally with `helm template`:
- powerdns-api-credentials Secret renders with random api-key + webserver-password
- create-zone-if-not-exist-sh Job no longer rendered
- Deployment env continues to reference powerdns-api-credentials correctly
Bumped 1.1.2 -> 1.1.3 (chart, blueprint, both bootstrap-kit overlays).
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>