Root cause (otech22 2026-05-02): Cilium operator checks for Gateway API
CRDs at startup and disables its gateway controller if they are absent —
a static, one-shot decision. Cloud-init installs k3s+Cilium first, then
Flux reconciles bp-gateway-api minutes later, so the operator always
starts without CRDs and never recovers. All 8 HTTPRoutes orphaned.
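The ordering problem can be illustrated with a toy model. This is a sketch only, not Cilium's code: `ToyOperator`, `start`, `reconcile`, and `boot` are hypothetical names, and the only behavior modeled is the one-shot CRD check described above.

```python
GATEWAY_CRD = "gateways.gateway.networking.k8s.io"


class ToyOperator:
    """Toy operator that decides once, at startup, whether to run its
    gateway controller. All names here are hypothetical."""

    def __init__(self):
        self.gateway_controller_enabled = False

    def start(self, installed_crds):
        # One-shot decision: evaluated at startup and never revisited.
        self.gateway_controller_enabled = GATEWAY_CRD in installed_crds

    def reconcile(self, routes):
        # With the controller disabled, no route is ever picked up.
        return list(routes) if self.gateway_controller_enabled else []


def boot(crds_before_operator):
    """Simulate cluster bring-up and return the HTTPRoutes that get reconciled."""
    installed = set()
    if crds_before_operator:
        installed.add(GATEWAY_CRD)   # fix: CRDs applied before the operator starts
    op = ToyOperator()
    op.start(installed)
    installed.add(GATEWAY_CRD)       # a later CRD install; the operator does not re-check
    return op.reconcile([f"route-{i}" for i in range(8)])
```

In this model `boot(False)` reconciles no routes even though the CRD is installed afterwards, while `boot(True)` reconciles all eight.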
Three-part permanent fix:
1. cloud-init: apply Gateway API v1.1.0 experimental CRDs (incl.
   TLSRoute) BEFORE the Cilium helm install. Cilium 1.16.x requires
   the TLSRoute CRD to be present; without it the operator's capability
   check fails entirely and disables the gateway controller.
2. bp-cilium (1.1.2 → 1.1.3): add gatewayAPI.gatewayClass.create: "true"
   to force GatewayClass creation regardless of CRD presence at Helm
   render time. Upstream default "auto" skips GatewayClass when the
   Gateway API CRDs are absent at install time (Capabilities check).
3. bp-gateway-api (1.0.0 → 1.1.0): downgrade CRDs from v1.2.0 to v1.1.0
   and ship the experimental channel (TLSRoute, TCPRoute, UDPRoute,
   BackendLBPolicy, BackendTLSPolicy). Gateway API v1.2.0 changed
   status.supportedFeatures from string[] to object[]; Cilium 1.16.5
   writes the old string format and the v1.2.0 CRD rejects the status
   patch with "must be of type object: string", leaving GatewayClass
   permanently Unknown/Pending. v1.1.0 retains the string schema.
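The schema mismatch in part 3 can be sketched as follows. This is an illustration under stated assumptions, not the API server's validation code: `validate_supported_features` is a hypothetical helper, the feature names are placeholders, and the `name` key of the object form is an assumption about the v1.2.0 shape. The error text mimics the message quoted above.

```python
TYPE_NAMES = {str: "string", dict: "object"}

# Item type per CRD version, as described in part 3 (string[] vs object[]).
ITEM_TYPE = {"v1.1.0": str, "v1.2.0": dict}


def validate_supported_features(features, crd_version):
    """Return a list of validation errors; empty means the payload is accepted.

    Hypothetical stand-in for CRD schema validation of
    status.supportedFeatures.
    """
    expected = ITEM_TYPE[crd_version]  # KeyError for versions not modeled here
    errors = []
    for i, item in enumerate(features):
        if not isinstance(item, expected):
            actual = TYPE_NAMES.get(type(item), type(item).__name__)
            errors.append(
                f"status.supportedFeatures[{i}]: "
                f"must be of type {TYPE_NAMES[expected]}: {actual}"
            )
    return errors


# Placeholder payloads: the old string form and the assumed object form.
old_format = ["Gateway", "HTTPRoute"]
new_format = [{"name": "Gateway"}, {"name": "HTTPRoute"}]
```

Under these assumptions the old string payload validates against the v1.1.0 schema and is rejected by v1.2.0, which is why the CRDs are pinned to v1.1.0 until the writer changes.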
Upgrade path: bump bp-gateway-api and bp-cilium together once Cilium
(≥ 1.17) adopts the v1.2.0 object schema for supportedFeatures.
Closes #503
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>