* fix(bp-cert-manager-powerdns-webhook): re-target to contabo PowerDNS, drop dynadot-webhook

Caught live on otech43-46: cert-manager DNS-01 challenges for *.otechN.omani.works failed because the Sovereign-side webhook wrote challenge TXT records to the Sovereign's local PowerDNS. omani.works is delegated from Dynadot to ns1/2/3.openova.io, which run on contabo's central PowerDNS; the Sovereign's local PowerDNS is INVISIBLE on the public DNS chain until pool-domain-manager seals the per-Sovereign NS delegation. Let's Encrypt resolvers walk the public chain, query contabo, get NXDOMAIN, and the cert never issues. The manual workaround was seeding challenge TXT directly in contabo PowerDNS.

This PR automates the right write path:

- bp-cert-manager-powerdns-webhook chart bumped to 1.0.4. Default powerdns.host flips from "" (skip-render) to https://pdns.openova.io (contabo's public PowerDNS API ingress, authoritative for omani.works).
- ClusterIssuer letsencrypt-dns01-prod-powerdns now usable with no per-cluster powerdns.host override for the omani.works pool. apiKeySecretRef.namespace clarified: upstream ignores it; the Secret must live in cert-manager namespace (= ChallengeRequest.ResourceNamespace for ClusterIssuers).
- bootstrap-kit slot 49 updated: drops bp-powerdns dependsOn (webhook calls out-of-cluster contabo, not local PowerDNS), bumps chart version, removes inline powerdns.host override (defaults are correct).
- bootstrap-kit slot 49b (bp-cert-manager-dynadot-webhook) DELETED entirely: Dynadot is NOT the API-level authority for omani.works subdomains, so the dynadot webhook silently fails the same way the Sovereign-local powerdns one did.
- clusters/_template/sovereign-tls/cilium-gateway-cert.yaml flips issuerRef from letsencrypt-dns01-prod (was dynadot-backed) to letsencrypt-dns01-prod-powerdns (the new contabo-backed issuer).
- bp-cert-manager chart: certManager.issuers.dns01.enabled defaults to false (deprecated dynadot path). letsencrypt-http01-prod retained for per-host certs. Cluster overlays MAY flip dns01.enabled=true for non-omani.works pools where Dynadot IS the API-level authority.
- scripts/expected-bootstrap-deps.yaml: drops slot 49b, drops bp-powerdns edge from slot 49.
- Documentation (README + blueprint.yaml + Chart.yaml description) rewritten to reflect contabo retarget and lifecycle reasoning.

Credential plumbing (out of scope here, must be done in cloud-init):

- Every Sovereign needs a `powerdns-api-credentials` Secret in the `cert-manager` namespace whose `api-key` value matches contabo's PowerDNS API key. Same seeding pattern as `dynadot-api-credentials` in infra/hetzner/cloudinit-control-plane.tftpl.

Caveat, basicAuth on contabo's PowerDNS API ingress: contabo currently fronts pdns.openova.io with Traefik basicAuth (per clusters/contabo-mkt/apps/powerdns/helmrelease.yaml). The upstream zachomedia/cert-manager-webhook-pdns binary supports the X-API-Key header but not HTTP Basic Auth out of the box. To make this end-to-end green, either contabo's basicAuth requirement must be relaxed (X-API-Key alone provides the auth posture, and contabo's API endpoint is restricted to operator IPs by other means), or the Sovereign's webhook needs an Authorization header injected via the chart's powerdns.headers map (plaintext password in the ClusterIssuer config, not ideal). This PR ships the chart side; the basicAuth question is a follow-up on the contabo side.

Verified locally:

- helm lint platform/cert-manager-powerdns-webhook/chart -> PASS
- helm template platform/cert-manager-powerdns-webhook/chart -> renders
- helm template ... --set clusterIssuer.enabled=true -> renders the ClusterIssuer with host="https://pdns.openova.io" + correct apiKey Secret reference.
- helm template platform/cert-manager/chart -> renders ONLY letsencrypt-http01-prod (the dns01 dynadot issuer correctly gated off).
- scripts/check-bootstrap-deps.sh: net-zero new drift; my branch reduces pre-existing errors from 3 to 2 (the dropped slot 49b removed the only drift my branch was responsible for).

Closes follow-up to #373. Preconditions for handover URL TLS green on otech43-46 lineage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(scripts): repair YAML structure in expected-bootstrap-deps.yaml

Two pre-existing drifts were blocking dependency-graph-audit CI:

1. Slot 5a (bp-reflector) was missing its closing list separator, causing yq to merge the bp-nats-jetstream entry into the bp-reflector map and effectively drop bp-reflector from the expected DAG. Added explicit `- slot: 7` for bp-nats-jetstream and quoted "5a" so yq treats it as a string slot (matches the convention with "49b").
2. bp-powerdns slot 11: actual bootstrap-kit declares dependsOn bp-cnpg (live since otech28; pdns-pg-app secret race) but the expected DAG was missing this edge.

This unblocks merging fix/cert-manager-powerdns-webhook-contabo (PR above). These drifts existed on main but weren't surfaced until the last expected-deps edit forced a re-run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bp-cert-manager-powerdns-webhook
Catalyst Blueprint for the cert-manager DNS-01 external webhook for PowerDNS. Closes openova#373.
What it is
A wrapper around the upstream
zachomedia/cert-manager-webhook-pdns
binary that satisfies cert-manager's external webhook contract
(webhook.acme.cert-manager.io/v1alpha1 — Present / CleanUp on a
ChallengeRequest) and writes ACME challenge TXT records to the
central openova PowerDNS (authoritative for omani.works) via
PowerDNS's REST API at https://pdns.openova.io.
This blueprint supersedes bp-cert-manager-dynadot-webhook for
omani.works Sovereigns: omani.works is registered at Dynadot but is
delegated to ns1/2/3.openova.io which run on contabo's PowerDNS.
Dynadot is NOT the API-level authority for omani.works subdomains;
contabo PowerDNS is. Caught live on otech43–46 where the dynadot
webhook silently failed to write challenge TXT records visible on the
public DNS chain.
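To make the write path concrete, here is a minimal shell sketch of the kind of PowerDNS HTTP API call involved when a challenge TXT record is seeded by hand. It is not the webhook's actual code. The function names, the `PDNS_HOST` and `PDNS_API_KEY` variables, the `localhost` server id, and the `/api/v1` prefix are assumptions based on the stock PowerDNS API; the contabo ingress may expose a different path or enforce additional authentication.

```shell
#!/usr/bin/env bash
# Build the JSON body for a PowerDNS "PATCH zone" request that upserts one TXT rrset.
# $1 = record FQDN (a trailing dot is added if missing), $2 = ACME challenge value.
pdns_txt_patch_body() {
  local name="${1%.}." value="$2"
  printf '{"rrsets":[{"name":"%s","type":"TXT","ttl":60,"changetype":"REPLACE","records":[{"content":"\\"%s\\"","disabled":false}]}]}\n' \
    "$name" "$value"
}

# Hypothetical helper: send the PATCH to the central PowerDNS API.
# $1 = zone (e.g. omani.works), $2 = record FQDN, $3 = challenge value.
# PDNS_HOST and PDNS_API_KEY are illustrative variable names, not chart values.
seed_challenge_txt() {
  local zone="${1%.}." name="$2" value="$3"
  curl -fsS -X PATCH \
    -H "X-API-Key: ${PDNS_API_KEY:?set PDNS_API_KEY}" \
    -H 'Content-Type: application/json' \
    --data "$(pdns_txt_patch_body "$name" "$value")" \
    "${PDNS_HOST:-https://pdns.openova.io}/api/v1/servers/localhost/zones/${zone}"
}
```

For example, `seed_challenge_txt omani.works _acme-challenge.otechN.omani.works "$TOKEN"` would attempt the same write the webhook performs on `Present`, assuming the API key is accepted.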
How DNS-01 validation works for *.${SOVEREIGN_FQDN}
When Let's Encrypt validates a DNS-01 challenge for
*.otechN.omani.works, its resolvers walk the public DNS chain:
Dynadot → ns1/2/3.openova.io (contabo PowerDNS). Until pool-domain-manager
has committed the per-Sovereign NS delegation into contabo PowerDNS, and
that delegation has propagated, the Sovereign's own PowerDNS is INVISIBLE
on the public chain.
This webhook writes the ACME challenge TXT record DIRECTLY to contabo's central PowerDNS, so Let's Encrypt validation succeeds on the first attempt regardless of whether the Sovereign-side delegation has sealed.
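The validation path described above can be checked by hand. The sketch below derives the `_acme-challenge` record name for a certificate dnsName and asks each central nameserver for its TXT value. The helper names are hypothetical, and expanding "ns1/2/3.openova.io" to three hostnames is an assumption about the naming.

```shell
#!/usr/bin/env bash
# Derive the DNS-01 record name for a certificate dnsName.
# A leading "*." wildcard label and a trailing dot are dropped.
acme_challenge_name() {
  local host="${1#\*.}"
  host="${host%.}"
  printf '_acme-challenge.%s\n' "$host"
}

# Query each central nameserver directly for the challenge TXT record.
# An empty answer means Let's Encrypt would not see the record either.
check_public_txt() {
  local record ns
  record="$(acme_challenge_name "$1")"
  for ns in ns1.openova.io ns2.openova.io ns3.openova.io; do
    printf '%s: ' "$ns"
    dig +short TXT "$record" "@${ns}" | tr '\n' ' '
    printf '\n'
  done
}
```

Running `check_public_txt '*.otechN.omani.works'` while a Challenge is pending should show the token on every nameserver if the webhook wrote to the right place.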
What this chart deploys
| Resource | Purpose |
|---|---|
| Deployment | Runs the upstream zachomedia/cert-manager-webhook-pdns image as a non-root pod. |
| Service | ClusterIP fronting the Deployment on port 443. |
| APIService | Registers v1alpha1.acme.powerdns.openova.io so the kube-apiserver routes ChallengeRequest calls to the Service. |
| Issuer (selfsigned) | Bootstraps the CA chain that issues the webhook's serving cert. |
| Issuer (CA) | Signs the leaf serving cert from the CA Secret. |
| Certificate (CA) | Root CA cert used by the APIService's cert-manager.io/inject-ca-from annotation. |
| Certificate (serving) | Leaf cert mounted into the Deployment at /tls. |
| ServiceAccount | Identity for the Deployment. |
| ClusterRoleBinding (auth-delegator) | Lets the aggregated apiserver delegate auth back to kube-apiserver. |
| RoleBinding (auth-reader) | Reads extension-apiserver-authentication ConfigMap from kube-system. |
| ClusterRole + ClusterRoleBinding (secret-reader) | Grants the SA get on Secrets cluster-wide so it can read the PowerDNS API-key Secret on demand. |
| ClusterRole + ClusterRoleBinding (domain-solver) | Lets cert-manager create ChallengeRequest CRs in the webhook's API group. |
| ClusterIssuer (letsencrypt-dns01-prod-powerdns) | Paired DNS-01 issuer. Renders when clusterIssuer.enabled=true (chart's default powerdns.host=https://pdns.openova.io is sufficient for the omani.works pool; cluster overlays may override the host for non-omani.works pools). |
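
One way to compare a render against the table above is to count the top-level kinds in the `helm template` output. This is a small sketch; `count_kinds` is a hypothetical helper, and the chart path is taken from the repository layout referenced elsewhere in this README's history.

```shell
#!/usr/bin/env bash
# Count top-level "kind:" entries in a multi-document YAML stream on stdin.
# Only unindented "kind:" lines are counted, so nested fields such as
# issuerRef.kind do not inflate the totals.
count_kinds() {
  awk '/^kind: / { n[$2]++ } END { for (k in n) print k, n[k] }' | LC_ALL=C sort
}

# Example usage (requires helm and a checkout of the repository):
#   helm template platform/cert-manager-powerdns-webhook/chart \
#     --set clusterIssuer.enabled=true | count_kinds
```

Each row of the table should appear in the output; a missing ClusterIssuer line usually just means clusterIssuer.enabled was left at its default.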
Pairing with bp-cert-manager
The blueprint lists bp-cert-manager under depends: in blueprint.yaml
(provides the cert-manager controllers + CRDs). It does NOT depend on
bp-powerdns — the webhook calls contabo's central PowerDNS, an
out-of-cluster endpoint, not the Sovereign's local PowerDNS.
Flux dependsOn enforces ordering at the HelmRelease level (see
clusters/_template/bootstrap-kit/49-bp-cert-manager-powerdns-webhook.yaml).
Configuration (per-Sovereign overlay)
The chart's defaults render a runnable webhook and skip the ClusterIssuer
(clusterIssuer.enabled defaults to false so CI smoke renders stay safe).
Sovereign overlays flip clusterIssuer.enabled=true and set the email:
clusterIssuer:
  enabled: true
  email: ops@<sovereign-fqdn>
  acmeServer: https://acme-v02.api.letsencrypt.org/directory # or staging during bring-up
# `powerdns.host` defaults to https://pdns.openova.io (contabo central
# PowerDNS, authoritative for omani.works). Override only when
# provisioning a Sovereign in a non-omani.works pool.
# powerdns:
#   host: "https://pdns.<other-pool>"
The credential Secret powerdns-api-credentials MUST live in the
cert-manager namespace on every Sovereign (the upstream webhook
ignores any namespace: field on the apiKeySecretRef and reads the
Secret from cert-manager's cluster-resource-namespace). The Secret's
api-key value MUST match the API key configured on contabo's central
PowerDNS — provisioned by cloud-init at control-plane boot time
(infra/hetzner/cloudinit-control-plane.tftpl).
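For bring-up or debugging, the Secret can also be rendered by hand. The sketch below is a hypothetical manual equivalent, not the cloud-init implementation; it assumes the key contains no double quotes or backslashes, and `render_pdns_secret` is an illustrative name.

```shell
#!/usr/bin/env bash
# Print a Secret manifest carrying the central PowerDNS API key.
# The name, namespace, and data key follow the requirements stated above.
render_pdns_secret() {
  local key="${1:?usage: render_pdns_secret <api-key>}"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: powerdns-api-credentials
  namespace: cert-manager
type: Opaque
stringData:
  api-key: "${key}"
EOF
}

# Example usage (requires cluster access):
#   render_pdns_secret "$PDNS_API_KEY" | kubectl apply -f -
```

Using stringData keeps the sketch readable; the API server stores the value base64-encoded under data either way.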
Per docs/INVIOLABLE-PRINCIPLES.md #4, every URL/zone is operator-overridable.
No hardcoded omantel.omani.works lives in this chart.
Smoke test
Once both charts (bp-cert-manager + bp-cert-manager-powerdns-webhook) are reconciled on a Sovereign:
# Verify the webhook is running and the APIService is healthy
kubectl get -n cert-manager deploy/release-name-bp-cert-manager-powerdns-webhook
kubectl get apiservices.apiregistration.k8s.io v1alpha1.acme.powerdns.openova.io
# Issue a wildcard cert against the Sovereign apex
cat <<EOF | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-omantel-omani-works
  namespace: kube-system
spec:
  secretName: wildcard-omantel-omani-works-tls
  issuerRef:
    name: letsencrypt-dns01-prod-powerdns
    kind: ClusterIssuer
  dnsNames:
    - "*.omantel.omani.works"
EOF
# Watch the Order + Challenge progress
kubectl get certificate,order,challenge -A -w
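For scripted smoke tests, watching interactively can be swapped for a bounded wait. This is a sketch only: `wait_for_cert` is a hypothetical helper, and the 10-minute default timeout is an arbitrary assumption rather than a documented issuance time.

```shell
#!/usr/bin/env bash
# Block until the named Certificate reports Ready, or list pending
# Challenges and fail. $1 = certificate name, $2 = namespace, $3 = timeout.
wait_for_cert() {
  local name="${1:?certificate name}" ns="${2:-kube-system}" timeout="${3:-10m}"
  if kubectl wait --for=condition=Ready "certificate/${name}" -n "$ns" --timeout="$timeout"; then
    echo "certificate ${name} is Ready"
  else
    # Leftover Challenges usually point at the DNS-01 presentation step.
    kubectl get challenge -A -o wide
    return 1
  fi
}

# Example usage:
#   wait_for_cert wildcard-omantel-omani-works kube-system 15m
```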
See also
- Upstream: https://github.com/zachomedia/cert-manager-webhook-pdns
- platform/cert-manager/chart/templates/clusterissuer-letsencrypt-dns01.yaml: legacy letsencrypt-dns01-prod (now default-disabled; was dynadot-backed)
- platform/powerdns/: the per-Sovereign DNS authority for app-level records (NOT in the cert-issuance path)
- openova#373: closing issue
- cert-manager DNS-01 webhook docs