openova/platform/external-dns
e3mrah 31784d7ed5
fix(bp-external-dns): apiserver Endpoints sync timeout — Cilium kube-apiserver entity required (closes #770) (#771)
* fix(bp-external-dns): grant apiserver egress via CiliumNetworkPolicy (closes #770)

Root cause: ExternalDNS crashloops on every fresh Sovereign provision
with `failed to sync *v1.Endpoints: context deadline exceeded`. The
companion vanilla NetworkPolicy egress rule
`to: ipBlock: 0.0.0.0/0 ports: 443,6443` does NOT match traffic to the
kube-apiserver under Cilium with the default `policy-cidr-match-mode: ""`.
Cilium models the apiserver as a reserved identity, not a CIDR range,
so the ipBlock rule is bypassed and the apiserver call is dropped at
the egress hook of the external-dns endpoint.

Fix: render a companion CiliumNetworkPolicy with
`toEntities: [kube-apiserver]` scoped to the external-dns Pod selector.
This is the canonical Cilium pattern for controllers that watch the
apiserver. The existing vanilla NetworkPolicy is preserved verbatim so
the Blueprint remains CNI-agnostic per BLUEPRINT-AUTHORING.md.
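
For illustration, the companion policy described above might render roughly as follows. This is a minimal sketch, not the chart's actual template: the policy name and the Pod label are assumptions, and only the `toEntities: [kube-apiserver]` rule comes from this commit message.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: external-dns-apiserver-egress   # hypothetical name
  namespace: external-dns
spec:
  # Scope the rule to the external-dns Pods only (label is an assumption).
  endpointSelector:
    matchLabels:
      app.kubernetes.io/name: external-dns
  egress:
    # Match the apiserver by its reserved Cilium identity rather than by CIDR.
    - toEntities:
        - kube-apiserver
```

Cilium allow rules are additive, so this policy widens egress for the selected Pods without touching the vanilla NetworkPolicy.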

Live proof on otech93 (2026-05-04): manually applied the rendered CNP
to the running cluster, external-dns transitioned from CrashLoopBackOff
(8 restarts in 20m) to 1/1 Running within 30s, informer cache sync
completed cleanly.

Bumps bp-external-dns 1.1.6 → 1.1.7.

Why not `policy-cidr-match-mode: nodes` cluster-wide on bp-cilium? It
silently relaxes EVERY other NetworkPolicy that uses 0.0.0.0/0 in the
cluster — too broad. Per INVIOLABLE-PRINCIPLES the fix MUST be scoped
to the workload that needs it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(_template): bump bp-external-dns 1.1.6 → 1.1.7 to pick up CNP fix

Pairs with the chart bump in the same PR. Every fresh otech provision
hydrates clusters/_template/, so this pin is what determines the
version installed. Without bumping here, otech94+ would still use
1.1.6 and continue to crashloop with the apiserver-egress symptom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 19:27:17 +04:00
chart fix(bp-external-dns): apiserver Endpoints sync timeout — Cilium kube-apiserver entity required (closes #770) (#771) 2026-05-04 19:27:17 +04:00
policies feat(external-dns): #109 — Catalyst-curated dynadot-multi-domain policy 2026-04-28 14:45:53 +02:00
README.md refactor(platform): remove k8gb — replaced by PowerDNS lua-records (#171) 2026-04-29 08:51:09 +02:00

ExternalDNS

ExternalDNS handles DNS synchronization: it registers and deletes records through the PowerDNS REST API and, where applicable, external cloud DNS APIs. It is per-host-cluster infrastructure (see docs/PLATFORM-TECH-STACK.md §3.1) and runs on every host cluster, primarily on the DMZ block. PowerDNS (see docs/PLATFORM-POWERDNS.md) is the authoritative server for every Sovereign zone; ExternalDNS uses the webhook provider (external-dns-pdns) to write A/AAAA/CNAME records into PowerDNS. Health-checked geo-failover lives in PowerDNS lua-records; see docs/MULTI-REGION-DNS.md.

Status: Accepted | Updated: 2026-04-27


Overview

ExternalDNS synchronizes Kubernetes resources (Gateway, Service, Ingress) with external DNS providers, enabling automatic DNS record management.
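
As a rough illustration of the Service source, the sketch below shows a LoadBalancer Service that ExternalDNS could pick up through the upstream `external-dns.alpha.kubernetes.io/hostname` annotation. The Service name, namespace, selector, ports, and `svc.example.com` are placeholders, not values from this repository.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: svc                 # placeholder name
  namespace: default        # placeholder namespace
  annotations:
    # Hostname ExternalDNS should publish for this Service.
    external-dns.alpha.kubernetes.io/hostname: svc.example.com
spec:
  type: LoadBalancer
  selector:
    app: svc                # placeholder selector
  ports:
    - port: 443
      targetPort: 8443
```

Once the load balancer address is assigned, ExternalDNS would create a matching A record together with its TXT ownership record.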


Architecture

flowchart TB
    subgraph K8s["Kubernetes"]
        GW[Gateway API]
        Svc[Services]
        ExtDNS[ExternalDNS]
    end

    subgraph DNS["DNS Providers"]
        PDNS["PowerDNS<br>(authoritative for every Sovereign zone)"]
        CF[Cloudflare]
        R53[Route53]
        HDNS[Hetzner DNS]
    end

    GW --> ExtDNS
    Svc --> ExtDNS
    ExtDNS --> PDNS
    ExtDNS --> CF
    ExtDNS --> R53
    ExtDNS --> HDNS

Supported DNS Providers

| Provider | Availability |
|----------|--------------|
| Cloudflare | Always |
| Hetzner DNS | If Hetzner chosen |
| AWS Route53 | If AWS chosen |
| GCP Cloud DNS | If GCP chosen |
| Azure DNS | If Azure chosen |

Configuration

ExternalDNS Deployment

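apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: external-dns
spec:
  # A Deployment requires a selector that matches the template labels.
  # The label value is illustrative, not taken from the chart.
  selector:
    matchLabels:
      app.kubernetes.io/name: external-dns
  template:
    metadata:
      labels:
        app.kubernetes.io/name: external-dns
    spec:
      containers:
        - name: external-dns
          image: registry.k8s.io/external-dns/external-dns:v0.14.0
          args:
            # Watch Gateway API routes and Services for hostnames.
            - --source=gateway-httproute
            - --source=gateway-grpcroute
            - --source=service
            # Publish to Cloudflare with proxying enabled.
            - --provider=cloudflare
            - --cloudflare-proxied
            # TXT registry: owner ID and record prefix for ownership tracking.
            - --txt-owner-id=openova
            - --txt-prefix=_externaldns.
          env:
            # API token read from a Secret in the same namespace.
            - name: CF_API_TOKEN
              valueFrom:
                secretKeyRef:
                  name: cloudflare-credentials
                  key: api-token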

PowerDNS Integration (geo + health-checked failover)

ExternalDNS writes plain A/AAAA/CNAME records into PowerDNS via the REST API. Geo-aware and health-checked failover responses are owned by PowerDNS lua-records, written by the catalyst-dns controller — ExternalDNS does NOT manage lua-record content.

flowchart LR
    subgraph Region1["Region 1"]
        App1[Application]
        ExtDNS1[ExternalDNS]
    end

    subgraph Region2["Region 2"]
        App2[Application]
        ExtDNS2[ExternalDNS]
    end

    subgraph PDNS["PowerDNS Authoritative"]
        ZoneAPI[REST API]
        Lua["lua-records (ifurlup, pickclosest)"]
    end

    ExtDNS1 -->|"plain A/AAAA"| ZoneAPI
    ExtDNS2 -->|"plain A/AAAA"| ZoneAPI
    ZoneAPI --- Lua

See docs/MULTI-REGION-DNS.md for the lua-record patterns.
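
For the PowerDNS path, a webhook-provider setup could look roughly like the Pod spec fragment below. This is a sketch under assumptions: the sidecar layout, the placeholder image reference, and the choice of sources are illustrative. Only the `--provider=webhook` and `--webhook-provider-url` flags are upstream ExternalDNS options, and `http://localhost:8888` is the upstream default URL.

```yaml
# Pod spec fragment only, not a complete Deployment.
containers:
  - name: external-dns
    image: registry.k8s.io/external-dns/external-dns:v0.14.0
    args:
      - --source=gateway-httproute
      - --source=service
      # Delegate record changes to a webhook served inside the same Pod.
      - --provider=webhook
      - --webhook-provider-url=http://localhost:8888
      - --txt-owner-id=openova
      - --txt-prefix=_externaldns.
  - name: external-dns-pdns
    # Placeholder image reference; substitute the real one.
    image: "REPLACE_ME/external-dns-pdns:TAG"
    # The PowerDNS API URL and API key would be supplied here. The variable
    # names depend on the webhook implementation, so none are shown.
```

Running the webhook as a sidecar keeps the provider call on localhost, so only the sidecar needs network access to the PowerDNS REST API.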


Record Types

| Source | Record Type | Example |
|--------|-------------|---------|
| Gateway | A/CNAME | api.<domain> |
| Service (LoadBalancer) | A | svc.<domain> |
| catalyst-dns (lua-record author) | LUA A | app.<domain> (geo + health-checked) |
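
To make the Gateway source concrete, here is a sketch of an HTTPRoute that the `gateway-httproute` source could translate into a record for `api.<domain>`. The route name, the parent Gateway reference, the backend, and `api.example.com` are placeholders, and the API version served depends on the Gateway API CRDs installed in the cluster.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api                      # placeholder name
  namespace: default             # placeholder namespace
spec:
  parentRefs:
    - name: public-gateway       # hypothetical Gateway
      namespace: gateway-system  # hypothetical namespace
  hostnames:
    # Hostname ExternalDNS publishes; the target comes from the Gateway address.
    - api.example.com
  rules:
    - backendRefs:
        - name: api              # placeholder backend Service
          port: 8080
```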

TXT Registry

ExternalDNS uses TXT records to track ownership:

_externaldns.api.<domain> TXT "heritage=external-dns,external-dns/owner=openova"

This prevents ExternalDNS from modifying records it doesn't own.


Part of OpenOva