openova/platform/powerdns
e3mrah 726af6df81
fix(bp-powerdns): self-generate api-credentials Secret + disable upstream zone-bootstrap Job (#248)
Root cause investigation on otech.omani.works (kubectl, sanitized):

  $ kubectl get pods -n powerdns
  create-zone-if-not-exist-sh-tjtr4   0/1  CreateContainerConfigError  4h
  powerdns-57d7d49f99-{9hrb4,lxlgt,nkmht}  0/1  CreateContainerConfigError  4h
  dnsdist-594dbfc5f-wznsw                  1/1  Running  4h

  $ kubectl get secrets -n powerdns
  powerdns                Opaque  1  4h
  powerdns-api-tls-8kxpx  Opaque  1  4h     (NO `powerdns-api-credentials`, NO `pdns-pg-app`)

  $ kubectl describe pod ... powerdns-57d7d49f99-9hrb4
  Environment:
    PDNS_API_KEY:  <set to the key 'api-key' in secret 'powerdns-api-credentials'>  Optional: false
    PDNS_DB_HOST:  <set to the key 'host' in secret 'pdns-pg-app'>                  Optional: false
    State: Waiting   Reason: CreateContainerConfigError

The handover's chicken-egg-with-secret theory was directionally right but
the cause was more fundamental:

  1. Wrapper chart's api-credentials-secret.yaml (1.1.2) was a no-op
     unless operator set `apiKey` value out-of-band — comment said the
     deployment would "fail to start until the named Secret exists" as
     "the explicit signal we want". On a Sovereign that bootstraps from
     bp-* OCI artifacts, no operator is standing by, so the Secret is
     never created and pods sit in CreateContainerConfigError forever.

  2. The upstream chart's `create-zone-if-not-exists-sh` Job is rendered
     whenever both `zoneName` and `api.key` are set — defaulting
     `zoneName: "example.de."` it ALWAYS rendered and ALWAYS failed
     (same missing Secret). Catalyst doesn't want this Job at all
     because zones are loaded later by pool-domain-manager (PDM).

  3. The chart's CNPG Cluster template is gated behind
     Capabilities.APIVersions.Has "postgresql.cnpg.io/v1" — on a fresh
     Sovereign without bp-cnpg yet (bp-cnpg is on the roadmap, not in
     bootstrap-kit), no Cluster is rendered and `pdns-pg-app` Secret
     never materialises. With Helm `--wait`, install times out
     ("context deadline exceeded") even though the manifests applied
     cleanly.

Fix:

  * api-credentials-secret.yaml: self-generate via Helm `lookup` +
    `randAlphaNum 32`. First install creates fresh randoms; every
    subsequent reconcile reads back the existing values from the
    Secret so the API key never rotates on upgrade. Operator can
    still pin specific values via .Values.powerdns.apiKey /
    .Values.powerdns.webserverPassword, or skip Secret creation
    entirely via .Values.powerdns.useExistingApiSecret. Same pattern
    as bitnami/postgresql, bitnami/keycloak.

  * values.yaml: set `powerdns.zoneName: ""` so upstream chart's
    `{{- if and .Values.powerdns.zoneName .Values.powerdns.api.key }}`
    gate skips the create-zone Job entirely. Catalyst's PDM creates
    zones via the REST API after the cluster comes up; we don't want
    a placeholder `example.de.` zone in production.

  * HelmRelease (both _template and otech.omani.works overlays):
    `install.disableWait: true` + `upgrade.disableWait: true` so the
    HelmRelease reports Ready as soon as manifests apply cleanly,
    rather than gating on powerdns Deployment readiness which depends
    on bp-cnpg landing first to synthesise `pdns-pg-app`. Runtime
    convergence is observed via kubectl, not gated on Helm.

Live error this addresses:
  Helm upgrade failed for release powerdns/powerdns with chart
  bp-powerdns@1.1.2: context deadline exceeded

Verified locally with `helm template`:
  - powerdns-api-credentials Secret renders with random api-key + webserver-password
  - create-zone-if-not-exist-sh Job no longer rendered
  - Deployment env continues to reference powerdns-api-credentials correctly

Bumped 1.1.2 -> 1.1.3 (chart, blueprint, both bootstrap-kit overlays).

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 16:55:12 +04:00
..
chart fix(bp-powerdns): self-generate api-credentials Secret + disable upstream zone-bootstrap Job (#248) 2026-04-30 16:55:12 +04:00
blueprint.yaml fix(bp-powerdns): self-generate api-credentials Secret + disable upstream zone-bootstrap Job (#248) 2026-04-30 16:55:12 +04:00