openova/platform/cilium
e3mrah 69de64ba19
fix(cilium): k8sServiceHost 127.0.0.1 → 10.0.1.2 so workers' Cilium can reach apiserver (#738)
Issue #733 follow-up. The default cpx32 multi-node Sovereign (1 CP + 2
workers) provisioned successfully, but worker nodes stuck NotReady
because cilium-agent on workers crashloop'd:

  Get "https://127.0.0.1:6443/api/v1/namespaces/kube-system":
    dial tcp 127.0.0.1:6443: connect: connection refused

Root cause: `k8sServiceHost: 127.0.0.1` works on the k3s SERVER node
(supervisor binds localhost:6443) but FAILS on every k3s AGENT node
(agent does NOT expose apiserver on localhost — only the supervisor
on :6444). Pre-#733 every Sovereign was solo (worker_count=0), so
this never fired.

Fix: point Cilium at `10.0.1.2`, the CP's stable private IP on the
Sovereign's 10.0.1.0/24 subnet (cp1=10.0.1.2 per main.tf network
block). No-op on the CP (10.0.1.2 IS its own private IP) and works
on workers (which already join the cluster via the same address per
cloudinit-worker.tftpl `K3S_URL=https://${cp_private_ip}:6443`).

Files:
- infra/hetzner/cloudinit-control-plane.tftpl — bootstrap helm install
  values file written to /var/lib/catalyst/cilium-values.yaml
- platform/cilium/chart/values.yaml — Flux bp-cilium HelmRelease
  values (cilium_values_parity_test.go enforces the two stay aligned)

Verified live on otech50: 3× CPX32 servers running, 1 CP Ready, 2
workers registered with k3s but NotReady due to cilium init failure.
After this fix the workers should reach Ready and the Phase-1 watcher
should see all components Ready=True across the multi-node cluster.

Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:23:51 +04:00

Cilium

Unified CNI + Service Mesh for Kubernetes with eBPF. Per-host-cluster infrastructure (see docs/PLATFORM-TECH-STACK.md §3.1) — installed on every host cluster Catalyst manages, before any other workload (CNI must come first).

Status: Accepted | Updated: 2026-04-27


Overview

Cilium provides both Container Network Interface (CNI) and Service Mesh capabilities in a single, unified solution. It replaces Flannel, kube-proxy, and eliminates the need for a separate service mesh like Istio.

flowchart TB
    subgraph Node["Kubernetes Node"]
        subgraph Cilium["Cilium"]
            Agent[Cilium Agent]
            eBPF[eBPF Programs]
            Hubble[Hubble]
            Envoy[Envoy Proxy<br/>L7 only]
        end

        subgraph Pods["Application Pods"]
            subgraph Pod1["Pod A"]
                App1[Application]
                OTel1[OTel Agent]
            end
            subgraph Pod2["Pod B"]
                App2[Application]
                OTel2[OTel Agent]
            end
        end
    end

    subgraph Observability["Observability"]
        Collector[OTel Collector]
        Grafana[Grafana Stack]
    end

    Agent --> eBPF
    eBPF -->|"L3/L4"| Pod1
    eBPF -->|"L3/L4"| Pod2
    Envoy -->|"L7 when needed"| Pod1
    Envoy -->|"L7 when needed"| Pod2

    OTel1 --> Collector
    OTel2 --> Collector
    Hubble --> Grafana
    Collector --> Grafana

Why Cilium?

CNI Comparison

| Feature | Flannel | Cilium |
|---|---|---|
| Basic networking | Yes | Yes |
| Network policies | No | Yes (L3-L7) |
| Observability | No | Yes (Hubble) |
| kube-proxy replacement | No | Yes |
| eBPF performance | No | Yes |
| Service mesh | No | Yes (built-in) |
| L2 LoadBalancer | No | Yes |
| Gateway API | No | Yes |

Service Mesh Comparison

| Feature | Istio | Cilium Service Mesh |
|---|---|---|
| mTLS | Envoy/ztunnel | eBPF + WireGuard |
| L7 Policies | Envoy | CiliumEnvoyConfig |
| Network Policies | Via CNI | Native (eBPF) |
| Resource Overhead | Medium-High | Low |
| Observability | Envoy metrics | Hubble + OTel |
| CNI Integration | Separate | Unified |
| Complexity | High | Medium |

Key Finding: OTel Independence

OpenTelemetry auto-instrumentation is independent of the service mesh:

| Component | Provides |
|---|---|
| OpenTelemetry | SQL queries, HTTP calls, application traces |
| Service Mesh | mTLS encryption, L7 routing, network policies |

Cilium provides mesh features without sacrificing observability: OTel handles application tracing regardless of mesh choice.


Features

CNI Features

| Feature | Purpose |
|---|---|
| kubeProxyReplacement | Replace kube-proxy with eBPF |
| hubble | Network observability |
| hubble.relay | Metrics export to Grafana |
| gatewayAPI | Gateway API for ingress |
| encryption | WireGuard-based mTLS |
| envoy | L7 traffic management |
| l2announcements | L2 LoadBalancer for bare-metal |

Service Mesh Capabilities

| Capability | How Cilium Provides It |
|---|---|
| mTLS | WireGuard encryption (transparent) |
| L7 Policies | CiliumEnvoyConfig + Network Policies |
| Traffic Management | Gateway API (HTTPRoute, GRPCRoute) |
| Observability | Hubble metrics, flows, traces |
| Circuit Breaker | CiliumEnvoyConfig outlier detection |
| Retries | HTTPRoute retry policies |

Configuration

Helm Values

# values.yaml
kubeProxyReplacement: true
k8sServiceHost: ${API_SERVER_IP}
k8sServicePort: 6443

# Service Mesh features
authentication:
  mutual:
    spire:
      enabled: false  # Using Cilium native

encryption:
  enabled: true
  type: wireguard

# L7 proxy
envoy:
  enabled: true

# Hubble observability
hubble:
  enabled: true
  relay:
    enabled: true
  ui:
    enabled: true
  metrics:
    enabled:
      - dns
      - drop
      - tcp
      - flow
      - icmp
      - http

# Gateway API
gatewayAPI:
  enabled: true

# L2 LoadBalancer (optional)
l2announcements:
  enabled: true
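
As fix #738 in the commit above explains, k8sServiceHost must be an apiserver address that is reachable from every node: 127.0.0.1 only works on the k3s server node, so a multi-node Sovereign points it at the control plane's private IP instead. A minimal sketch of that override (10.0.1.2 is the address from the commit message's network layout; substitute your own control-plane address):

# Multi-node override (#738): apiserver address reachable from worker nodes too
k8sServiceHost: 10.0.1.2   # control-plane private IP on the Sovereign's 10.0.1.0/24 subnet
k8sServicePort: 6443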

Network Policies

mTLS (Mutual TLS)

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: require-mtls
spec:
  endpointSelector:
    matchLabels:
      app: my-service
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: allowed-client
      authentication:
        mode: required

L7 HTTP Policies

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: http-policy
spec:
  endpointSelector:
    matchLabels:
      app: api-service
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/v1/.*"
              - method: POST
                path: "/api/v1/users"

Gateway API

Cilium replaces traditional ingress controllers with Gateway API:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: cilium-gateway
spec:
  gatewayClassName: cilium
  listeners:
    - name: https
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          - name: tls-secret
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route
spec:
  parentRefs:
    - name: cilium-gateway
  hostnames:
    - "app.example.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: api-service
          port: 8080
      timeouts:
        request: 30s
      retry:
        attempts: 3
        backoff: 100ms

Resilience Patterns

Circuit Breaker Tiers

| Service Tier | consecutiveErrors | baseEjectionTime | maxEjectionPercent |
|---|---|---|---|
| Critical | 3 | 30s | 50% |
| Standard | 5 | 30s | 50% |
| Background | 10 | 60s | 75% |

Standard Service Circuit Breaker

apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
  name: service-circuit-breaker
spec:
  services:
    - name: my-service
      namespace: default
  resources:
    - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
      name: my-service
      connect_timeout: 5s
      circuit_breakers:
        thresholds:
          - priority: DEFAULT
            max_connections: 100
            max_pending_requests: 100
            max_requests: 1000
            max_retries: 3
      outlier_detection:
        consecutive_5xx: 5
        interval: 10s
        base_ejection_time: 30s
        max_ejection_percent: 50

Critical Service (Database Proxy)

apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
  name: db-proxy-circuit-breaker
spec:
  services:
    - name: db-proxy
      namespace: default
  resources:
    - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
      name: db-proxy
      connect_timeout: 3s
      circuit_breakers:
        thresholds:
          - priority: DEFAULT
            max_connections: 50
            max_pending_requests: 50
            max_retries: 2
      outlier_detection:
        consecutive_5xx: 3
        interval: 5s
        base_ejection_time: 30s
        max_ejection_percent: 50
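
Background Service

The Background tier from the table above has no example of its own; the following sketch reuses the same pattern with that tier's looser thresholds (10 consecutive errors, 60s ejection, 75% max ejection). The batch-worker service name and the connection limits are illustrative:

apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
  name: background-circuit-breaker
spec:
  services:
    - name: batch-worker          # illustrative service name
      namespace: default
  resources:
    - "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
      name: batch-worker
      connect_timeout: 10s
      circuit_breakers:
        thresholds:
          - priority: DEFAULT
            max_connections: 200
            max_pending_requests: 200
            max_requests: 1000
            max_retries: 3
      outlier_detection:
        consecutive_5xx: 10
        interval: 10s
        base_ejection_time: 60s
        max_ejection_percent: 75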

Timeout Configuration

apiVersion: cilium.io/v2
kind: CiliumEnvoyConfig
metadata:
  name: service-timeouts
spec:
  services:
    - name: my-service
      namespace: default
  resources:
    - "@type": type.googleapis.com/envoy.config.route.v3.RouteConfiguration
      name: my-service-routes
      virtual_hosts:
        - name: my-service
          domains: ["*"]
          routes:
            - match:
                prefix: "/"
              route:
                cluster: my-service
                timeout: 30s

Health Probes

| Probe Type | Purpose | Failure Action |
|---|---|---|
| Startup | Wait for initialization | Block traffic |
| Readiness | Can accept traffic | Remove from LB |
| Liveness | Process healthy | Restart pod |

Standard Configuration

startupProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
  failureThreshold: 30

readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  periodSeconds: 10
  failureThreshold: 3

livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  periodSeconds: 20
  failureThreshold: 3

SLO-Based Alerting

| SLI | Target | Alert Threshold |
|---|---|---|
| Availability | 99.9% | <99.5% for 5m |
| Latency (p95) | <500ms | >1s for 5m |
| Error Rate | <0.1% | >1% for 5m |
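
How these thresholds become alerts depends on the monitoring stack. As one possible shape, a sketch of the Error Rate row as a Prometheus Operator PrometheusRule, assuming the Grafana stack scrapes Hubble's HTTP metrics (the hubble_http_requests_total metric and its status label are Hubble's exported names; verify them against the Hubble version in use):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cilium-slo-alerts
spec:
  groups:
    - name: slo-error-rate
      rules:
        - alert: HighErrorRate
          # share of 5xx responses among HTTP requests seen by Hubble over 5 minutes
          expr: |
            sum(rate(hubble_http_requests_total{status=~"5.."}[5m]))
              / sum(rate(hubble_http_requests_total[5m])) > 0.01
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "HTTP error rate above 1% for 5 minutes (SLO target <0.1%)"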

LoadBalancer Options

flowchart TB
    subgraph Options["LoadBalancer Options"]
        subgraph CloudLB["Cloud LB (Recommended)"]
            HetznerLB[Hetzner LB]
            OCILB[OCI LB]
        end

        subgraph PdnsLB["PowerDNS lua-records (Free)"]
            PDNS[PowerDNS authoritative + ifurlup/pickclosest]
            ExternalDNS[ExternalDNS]
        end

        subgraph CiliumLB["Cilium L2 (Single Subnet)"]
            L2[L2 Announcements]
        end
    end

    CloudLB -->|"Best for cloud"| Service[K8s Service]
    PdnsLB -->|"Best for multi-region"| Service
    CiliumLB -->|"Best for single node"| Service

| Option | Cost | Multi-Region | Use Case |
|---|---|---|---|
| Cloud LB | Paid | Via PowerDNS lua-records | Production |
| PowerDNS lua-records | Free | Native | Cost-sensitive |
| Cilium L2 | Free | No | Single subnet/dev |
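
For the Cilium L2 option, the l2announcements flag enabled in the Helm values above also needs an address pool and an announcement policy. A minimal sketch, assuming recent Cilium CRD versions (the API versions, interface regex, and the 10.0.1.240/29 range are illustrative; check them against the Cilium release in use):

apiVersion: cilium.io/v2alpha1        # check the CRD version shipped with your Cilium release
kind: CiliumLoadBalancerIPPool
metadata:
  name: default-pool
spec:
  blocks:
    - cidr: 10.0.1.240/29             # illustrative range on the node subnet
---
apiVersion: cilium.io/v2alpha1
kind: CiliumL2AnnouncementPolicy
metadata:
  name: default-l2
spec:
  loadBalancerIPs: true               # announce LoadBalancer service IPs via ARP
  interfaces:
    - ^eth[0-9]+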

OpenTelemetry Integration

OTel is configured independently of Cilium:

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: auto-instrumentation
  namespace: opentelemetry
spec:
  exporter:
    endpoint: http://collector.opentelemetry.svc:4317
  propagators:
    - tracecontext
    - baggage
  sampler:
    type: parentbased_traceidratio
    argument: "1.0"
  java:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
  python:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
  nodejs:
    image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest

Pods opt-in via annotation:

metadata:
  annotations:
    instrumentation.opentelemetry.io/inject-java: "true"

Observability Stack

| Source | Provides |
|---|---|
| Hubble (Cilium) | Network flows, DNS queries, HTTP metrics (L7), dropped packets |
| OpenTelemetry | Application traces, SQL queries, HTTP client calls, custom spans |
| Grafana Stack | Loki (logs), Mimir (metrics), Tempo (traces), dashboards |
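
Getting the Hubble row of this table into the Grafana stack means exposing Hubble's metrics to a Prometheus-compatible scraper. A small addition to the Helm values above, assuming the Prometheus Operator CRDs are present (hubble.metrics.serviceMonitor is a chart option in recent Cilium versions; verify it against the chart in use):

hubble:
  metrics:
    serviceMonitor:
      enabled: true   # create a ServiceMonitor so the metrics enabled in the values above get scraped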

Migration from Istio

If migrating from Istio:

  1. Install Cilium with service mesh features
  2. Remove Istio sidecars (if using sidecar mode)
  3. Convert VirtualService → HTTPRoute
  4. Convert DestinationRule → CiliumNetworkPolicy + CiliumEnvoyConfig (see the sketch below)
  5. Verify OTel continues working (it's independent)
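
As a rough before/after for step 4, an Istio DestinationRule that carries outlier detection maps onto the CiliumEnvoyConfig circuit-breaker pattern shown earlier. A sketch of the Istio side only (service name and thresholds are illustrative):

# Istio side (before): DestinationRule with outlier detection
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: my-service
spec:
  host: my-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
    outlierDetection:
      consecutive5xxErrors: 5
      interval: 10s
      baseEjectionTime: 30s
      maxEjectionPercent: 50

The equivalent on the Cilium side is the Standard Service Circuit Breaker above: circuit_breakers.thresholds carries the connection limits and outlier_detection carries the ejection settings.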

Consequences

Positive:

  • Unified CNI + service mesh (single component)
  • Advanced network policies (L3-L7)
  • Network observability via Hubble
  • eBPF performance
  • kube-proxy replacement
  • L2 LoadBalancer option
  • Native Kubernetes Gateway API support
  • Lower resource overhead than Istio

Negative:

  • More complex than Flannel
  • eBPF kernel requirements (5.10+)
  • Less mature than Istio for some features
  • CiliumEnvoyConfig less intuitive than VirtualService

Part of OpenOva