openova/platform/external-dns
e3mrah c5ffaa2fd7
fix(bp-external-dns): livenessProbe.initialDelaySeconds=180 for cold-cluster cache-sync (closes #700) (#707)
PR #679 added --request-timeout=120s but external-dns has TWO timeouts:
RequestTimeout (per-API-call, controlled by --request-timeout) and
WaitForCacheSync (initial informer sync, hardcoded 60s in upstream binary,
NOT exposed as a flag). On a fresh Sovereign with k3s apiserver
CPU-saturated, the cache sync misses 60s -> fatal: failed to sync
*v1.Node: context deadline exceeded -> CrashLoopBackOff 5-10 times.
Caught live on otech49+ (2026-05-03), 5 restarts before stable.

Bump livenessProbe.initialDelaySeconds from upstream 10s default to 180s
so kubelet does NOT restart the Pod while the initial cache sync runs
against a CPU-saturated freshly-provisioned k3s apiserver. The Sovereign
apiserver reaches steady-state within ~2 min so 3 min comfortably covers
cold starts. Also bumps periodSeconds=30 + failureThreshold=3 so a
genuinely-hung pod is still killed within ~90s once steady-state.
readinessProbe gets a corresponding initialDelaySeconds=30 so endpoint
flapping during sync doesn't churn services.

Helm overrides REPLACE whole maps (not merge), so the override preserves
the upstream httpGet.path: /healthz + port: http shape verbatim.

Bumps:
- platform/external-dns/chart/Chart.yaml: 1.1.5 -> 1.1.6
- clusters/_template/bootstrap-kit/12-external-dns.yaml: HelmRelease pin 1.1.5 -> 1.1.6

Co-authored-by: hatiyildiz <hatice@openova.io>
2026-05-03 23:39:36 +04:00
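
The probe settings described in the commit message above can be sketched as a Helm values fragment. This is an illustration only, not the chart's actual file. It assumes the keys sit at the top level of the upstream external-dns chart's values; in a wrapper chart they would be nested under the subchart's name or alias. Following the commit's approach, `httpGet` is restated in full rather than inherited. Fields the commit does not mention (`timeoutSeconds`, `successThreshold`) are left out.

```yaml
# Sketch only: key placement is an assumption, values are taken from the commit message.
livenessProbe:
  httpGet:
    path: /healthz
    port: http
  initialDelaySeconds: 180   # upstream default is 10s; covers the cold-cluster cache sync
  periodSeconds: 30
  failureThreshold: 3        # 3 failures x 30s period = ~90s until a hung pod is restarted
readinessProbe:
  httpGet:
    path: /healthz
    port: http
  initialDelaySeconds: 30    # avoids endpoint flapping while informers sync
```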
chart fix(bp-external-dns): livenessProbe.initialDelaySeconds=180 for cold-cluster cache-sync (closes #700) (#707) 2026-05-03 23:39:36 +04:00
policies feat(external-dns): #109 — Catalyst-curated dynadot-multi-domain policy 2026-04-28 14:45:53 +02:00
README.md refactor(platform): remove k8gb — replaced by PowerDNS lua-records (#171) 2026-04-29 08:51:09 +02:00

ExternalDNS

DNS synchronization: ExternalDNS registers and deletes records through the PowerDNS REST API and, where applicable, external cloud DNS APIs. It is per-host-cluster infrastructure (see docs/PLATFORM-TECH-STACK.md §3.1) and runs on every host cluster, primarily on the DMZ block. PowerDNS (see docs/PLATFORM-POWERDNS.md) is the authoritative server for every Sovereign zone; ExternalDNS uses the webhook provider (external-dns-pdns) to write A/AAAA/CNAME records into PowerDNS. Health-checked geo-failover lives in PowerDNS lua-records; see docs/MULTI-REGION-DNS.md.

Status: Accepted | Updated: 2026-04-27


Overview

ExternalDNS synchronizes Kubernetes resources (Gateway, Service, Ingress) with external DNS providers, enabling automatic DNS record management.


Architecture

flowchart TB
    subgraph K8s["Kubernetes"]
        GW[Gateway API]
        Svc[Services]
        ExtDNS[ExternalDNS]
    end

    subgraph DNS["DNS Providers"]
        PDNS["PowerDNS<br>(authoritative for every Sovereign zone)"]
        CF[Cloudflare]
        R53[Route53]
        HDNS[Hetzner DNS]
    end

    GW --> ExtDNS
    Svc --> ExtDNS
    ExtDNS --> PDNS
    ExtDNS --> CF
    ExtDNS --> R53
    ExtDNS --> HDNS

Supported DNS Providers

| Provider | Availability |
|---|---|
| Cloudflare | Always |
| Hetzner DNS | If Hetzner chosen |
| AWS Route53 | If AWS chosen |
| GCP Cloud DNS | If GCP chosen |
| Azure DNS | If Azure chosen |

Configuration

ExternalDNS Deployment

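apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: external-dns
spec:
  # apps/v1 requires a selector that matches the Pod template labels.
  selector:
    matchLabels:
      app.kubernetes.io/name: external-dns
  template:
    metadata:
      labels:
        app.kubernetes.io/name: external-dns
    spec:
      containers:
        - name: external-dns
          image: registry.k8s.io/external-dns/external-dns:v0.14.0
          args:
            # Watch Gateway API routes and Services for hostnames.
            - --source=gateway-httproute
            - --source=gateway-grpcroute
            - --source=service
            - --provider=cloudflare
            - --cloudflare-proxied
            # TXT registry: ownership marker and record-name prefix.
            - --txt-owner-id=openova
            - --txt-prefix=_externaldns.
          env:
            # API token is read from a Secret rather than inlined.
            - name: CF_API_TOKEN
              valueFrom:
                secretKeyRef:
                  name: cloudflare-credentials
                  key: api-token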

PowerDNS Integration (geo + health-checked failover)

ExternalDNS writes plain A/AAAA/CNAME records into PowerDNS via the REST API. Geo-aware and health-checked failover responses are owned by PowerDNS lua-records, written by the catalyst-dns controller — ExternalDNS does NOT manage lua-record content.

flowchart LR
    subgraph Region1["Region 1"]
        App1[Application]
        ExtDNS1[ExternalDNS]
    end

    subgraph Region2["Region 2"]
        App2[Application]
        ExtDNS2[ExternalDNS]
    end

    subgraph PDNS["PowerDNS Authoritative"]
        ZoneAPI[REST API]
        Lua["lua-records (ifurlup, pickclosest)"]
    end

    ExtDNS1 -->|"plain A/AAAA"| ZoneAPI
    ExtDNS2 -->|"plain A/AAAA"| ZoneAPI
    ZoneAPI --- Lua

See docs/MULTI-REGION-DNS.md for the lua-record patterns.
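
On the ExternalDNS side, the plain-record write path goes through the webhook provider (external-dns-pdns) named at the top of this README. Below is a minimal sketch of the container layout. It assumes the webhook runs as a sidecar in the same Pod and listens on ExternalDNS's default webhook URL. The sidecar image reference is a placeholder. The sidecar's own configuration (PowerDNS API URL, API key) is omitted because this README does not specify it.

```yaml
# Pod template fragment (spec.template.spec of the Deployment), not a full manifest.
containers:
  - name: external-dns
    image: registry.k8s.io/external-dns/external-dns:v0.14.0
    args:
      - --source=gateway-httproute
      - --source=service
      # Delegate all record writes to an out-of-tree provider over HTTP.
      - --provider=webhook
      # Default value, shown explicitly; the sidecar must listen on this address.
      - --webhook-provider-url=http://localhost:8888
      - --txt-owner-id=openova
      - --txt-prefix=_externaldns.
  - name: external-dns-pdns
    # Placeholder image reference (assumption): replace with the real webhook image.
    image: example.invalid/external-dns-pdns:placeholder
```

Running the webhook as a sidecar keeps the provider endpoint on localhost, so it never has to be exposed as a Service.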


Record Types

| Source | Record Type | Example |
|---|---|---|
| Gateway | A/CNAME | api.<domain> |
| Service (LoadBalancer) | A | svc.<domain> |
| catalyst-dns (lua-record author) | LUA A | app.<domain> (geo + health-checked) |
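
As an illustration of the first two rows, the manifests below sketch how a hostname typically reaches ExternalDNS from each source. For the `service` source it comes from the `external-dns.alpha.kubernetes.io/hostname` annotation. For the `gateway-httproute` source it comes from the route's `spec.hostnames`, with the record target taken from the parent Gateway's address. All names, namespaces, ports, and the `example.com` domain are placeholders, and the HTTPRoute `apiVersion` depends on the Gateway API version installed in the cluster.

```yaml
# Service (LoadBalancer) -> A record for svc.example.com
apiVersion: v1
kind: Service
metadata:
  name: svc
  namespace: default
  annotations:
    external-dns.alpha.kubernetes.io/hostname: svc.example.com
spec:
  type: LoadBalancer
  selector:
    app: svc
  ports:
    - port: 443
      targetPort: 8443
---
# HTTPRoute attached to a Gateway -> A/CNAME record for api.example.com
apiVersion: gateway.networking.k8s.io/v1   # may be v1beta1 on older installs
kind: HTTPRoute
metadata:
  name: api
  namespace: default
spec:
  parentRefs:
    - name: public-gateway        # hypothetical Gateway name
      namespace: gateway-system   # hypothetical namespace
  hostnames:
    - api.example.com
  rules:
    - backendRefs:
        - name: api
          port: 8080
```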

TXT Registry

ExternalDNS uses TXT records to track ownership:

_externaldns.api.<domain> TXT "heritage=external-dns,external-dns/owner=openova"

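With the TXT registry, ExternalDNS only updates or deletes records whose ownership TXT record carries its own owner ID (here, openova). Records created by other tools, or by instances with a different owner ID, are left untouched.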


Part of OpenOva