* fix(cilium-gateway): listener ports 80/443 → 30080/30443 + LB retarget

cilium-envoy refuses to bind privileged ports (80/443) on Sovereigns even with all of:

- gatewayAPI.hostNetwork.enabled=true on the Cilium chart
- securityContext.privileged=true on the cilium-envoy DaemonSet
- securityContext.capabilities.add=[NET_BIND_SERVICE]
- envoy-keep-cap-netbindservice=true in cilium-config ConfigMap
- Gateway API CRDs at v1.3.0 (matching cilium 1.19.3 schema)

Repeatable error from cilium-envoy logs across otech45, otech46, otech47:

    listener 'kube-system/cilium-gateway-cilium-gateway/listener' failed to bind or apply socket options: cannot bind '0.0.0.0:80': Permission denied

The bind() syscall is intercepted by cilium-agent's BPF socket-LB program in a way that does not honour container capabilities. Even PID 1 with CapEff=0x000001ffffffffff (all caps) and uid=0 gets "Permission denied". Cilium 1.19.3 → 1.16.5 made no difference (F1, PR #684 still ships; the version bump is sound for other reasons, and the listener bind is just a separate fix).

This commit moves the listeners to high ports (30080/30443) and lets the Hetzner LB do the public-facing port translation:

    HCLB :80  → CP node :30080 (cilium-gateway HTTP listener)
    HCLB :443 → CP node :30443 (cilium-gateway HTTPS listener)

External users still hit `https://console.<sov>.omani.works/auth/handover` on port 443; the high port is invisible. High-port bind succeeds without NET_BIND_SERVICE because the kernel only gates ports below `net.ipv4.ip_unprivileged_port_start` (default 1024).

Will be verified on otech48: the next fresh provision should serve console.otech48/auth/handover end-to-end without the 502/timeout chain seen on otech45–47.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(powerdns+catalyst-api): zero-touch contabo PowerDNS API key for Sovereign cert-manager

PR #681 followup.

The new bp-cert-manager-powerdns-webhook (PR #681) calls contabo's authoritative PowerDNS at pdns.openova.io to write DNS-01 challenge TXT records for *.otech<N>.omani.works. That webhook needs an X-API-Key Secret in the Sovereign's cert-manager namespace. PR #681 didn't ship the materialization seam, so on otech43..otech47 the Secret was missing and the wildcard cert never issued.

This commit closes the seam from contabo to the Sovereign:

1. bp-powerdns chart 1.1.7 to 1.1.8: Reflector annotations on openova-system/powerdns-api-credentials extended from "external-dns" to "external-dns,catalyst" so contabo catalyst-api can mount the API key.
2. bp-powerdns: api.basicAuth.enabled flips default true to false. Layered Traefik basicAuth + PowerDNS X-API-Key was double auth that blocked machine-to-machine API access from Sovereigns. The X-API-Key contract is unchanged.
3. bp-catalyst-platform 1.2.3 to 1.2.4: api-deployment.yaml adds CATALYST_POWERDNS_API_KEY env from powerdns-api-credentials/api-key secret (optional=true so Sovereign-side catalyst-api Pods that don't reflect this still start clean).
4. catalyst-api provisioner.go: new Provisioner.PowerDNSAPIKey field reads from CATALYST_POWERDNS_API_KEY env at New(). Stamps onto every Request before Validate(). Forwards as tofu var powerdns_api_key.
5. infra/hetzner/variables.tf: new var.powerdns_api_key (sensitive, default "").
6. infra/hetzner/cloudinit-control-plane.tftpl: replaces the defunct dynadot-api-credentials Secret block (PR #681 dropped bp-cert-manager-dynadot-webhook) with a new cert-manager/powerdns-api-credentials Secret block. runcmd applies it BEFORE Flux reconciles bp-cert-manager-powerdns-webhook.

End-to-end seam mirrors PR #543 ghcr-pull and PR #680 harbor-robot-token.

Will be verified live on otech48 (next provision after this lands).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: hatiyildiz <hatice@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
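
The env wiring described in item 3 of the powerdns commit could look roughly like the fragment below. This is a sketch, not the actual api-deployment.yaml template: the container name is an assumption, and only the env var name, the Secret name, the key, and optional=true come from the commit message.

```yaml
# Sketch only: fragment of a Deployment pod spec, not the chart's real template.
containers:
  - name: catalyst-api                     # assumed container name
    env:
      - name: CATALYST_POWERDNS_API_KEY    # read by the Provisioner at New()
        valueFrom:
          secretKeyRef:
            name: powerdns-api-credentials # Secret reflected from openova-system
            key: api-key
            optional: true                 # Pod still starts if the Secret is absent
```

With optional set to true, a Pod in a namespace where the Secret was never reflected starts with the variable unset rather than failing at container creation.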
67 lines
2.6 KiB
YAML
apiVersion: catalyst.openova.io/v1alpha1
kind: Blueprint
metadata:
  name: bp-powerdns
  labels:
    catalyst.openova.io/category: per-host-cluster-infrastructure
    catalyst.openova.io/section: pts-3-2-gitops-and-iac
spec:
  version: 1.1.8
  card:
    title: PowerDNS
    summary: |
      Authoritative DNS for every Sovereign zone (pool + BYO). Per-zone
      DNSSEC (ECDSAP256SHA256), lua-records for geo + health-checked
      failover, dnsdist front-end for query rate-limiting + DDoS posture,
      REST API at pdns.openova.io/api (operator-only). See
      docs/MULTI-REGION-DNS.md for the lua-record patterns.
    icon: powerdns.svg
    category: infrastructure
    visibility: unlisted  # mandatory infra, auto-installed by bootstrap kit
  configSchema:
    type: object
    properties:
      replicaCount:
        type: integer
        default: 3
        description: PowerDNS Authoritative replicas. Each connects to the same CNPG database.
      dnssec:
        type: boolean
        default: true
        description: |
          DNSSEC ON (ECDSAP256SHA256) per #167 acceptance. Off requires explicit
          override in cluster overlay AND a documented exception.
      luaRecords:
        type: boolean
        default: true
        description: PowerDNS Lua records — geo + health-checked failover. See docs/MULTI-REGION-DNS.md.
      dnsdistEnabled:
        type: boolean
        default: true
        description: Companion dnsdist for query rate-limiting (default 100 qps per source IP).
      qpsPerSource:
        type: integer
        default: 100
        description: dnsdist MaxQPSIPRule threshold per source IP.
  placementSchema:
    modes: [single-region, active-active]
    default: active-active  # the public NS endpoints span regions
  manifests:
    chart: ./chart
  # Hard depends — these primitives MUST be present in-cluster before
  # bp-powerdns reconciles cleanly:
  #   - postgresql.cnpg.io/v1.Cluster CRD (CNPG operator) consumed by
  #     templates/cnpg-cluster.yaml. Catalyst-Zero installs CNPG as a
  #     mandatory platform component (FABRIC group, componentGroups.ts
  #     `cnpg`); the wrapper Blueprint bp-cnpg is on the roadmap as a
  #     follow-up — until then we depend on the CRD being present, not
  #     on a sibling Blueprint.
  #   - cert-manager.io/v1.ClusterIssuer (letsencrypt-prod) referenced by
  #     templates/api-ingress.yaml. bp-cert-manager already exists.
  #   - traefik.io/v1alpha1.Middleware CRD — Traefik is the Catalyst-Zero
  #     ingress controller and is a fixture of every Sovereign.
  depends:
    - bp-cert-manager
  upgrades:
    from: ["0.x"]
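
As an illustration of how the configSchema properties above might be overridden, here is a hypothetical set of overlay values. The file location and flat top-level layout are assumptions; only the key names, types, and defaults come from the schema.

```yaml
# Hypothetical cluster overlay values for bp-powerdns (layout is an assumption).
replicaCount: 5        # schema default is 3; all replicas share one CNPG database
dnssec: true           # disabling requires a documented exception per the schema
luaRecords: true       # geo + health-checked failover records
dnsdistEnabled: true   # keep the dnsdist front-end for rate-limiting
qpsPerSource: 200      # schema default is 100 (dnsdist MaxQPSIPRule per source IP)
```

Keys left out of an overlay would presumably fall back to the schema defaults, so an overlay only needs to list the values it changes.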