openova/platform/reloader
e3mrah 9a58289786
fix(catalyst-api,bp-reloader): tofu state on PVC + Reloader annotations strategy (closes #715) (#716)
* fix(catalyst-api,bp-keycloak): handover 401 root-causes — Reloader annot + realm SA users array (#713)

Closes #713

Two distinct chart bugs surfaced live on otech62 (2026-05-03), both producing
401 on /auth/handover:

1. SOVEREIGN_FQDN race
   api-deployment.yaml reads SOVEREIGN_FQDN from ConfigMap "sovereign-fqdn"
   with optional:true. On Sovereigns, that ConfigMap is rendered by the
   sovereign-tls Flux Kustomization concurrently with bp-catalyst-platform
   HelmRelease. When the Pod starts first, valueFrom collapses to "" and
   stays empty — audience check rejects every valid token as "invalid
   audience". Fix: add Reloader annotations so the Pod rolls when the
   ConfigMap (and the handover-jwt-public Secret) appears.

2. catalyst-api-server SA missing user-level realm-management role mappings
   bp-keycloak realm import granted roles via clientScopeMappings — wrong
   level. The actual service-account user had no clientRoles entry, so KC
   rejected GET /users with 403 when catalyst-api tried to ensure the
   operator user during handover. Fix: add explicit "users" array binding
   service-account-catalyst-api-server to realm-management.{impersonation,
   manage-users, view-users, query-users}.
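
A minimal sketch of the first fix, assuming a Deployment named catalyst-api and a ConfigMap key named fqdn (both hypothetical; the commit names only the ConfigMap, the Secret, and the env var). The two annotations are Reloader's standard per-resource opt-ins:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: catalyst-api                      # hypothetical name
  annotations:
    # Roll the Pods when either of these objects changes.
    configmap.reloader.stakater.com/reload: "sovereign-fqdn"
    secret.reloader.stakater.com/reload: "handover-jwt-public"
spec:
  selector:
    matchLabels:
      app: catalyst-api
  template:
    metadata:
      labels:
        app: catalyst-api
    spec:
      containers:
        - name: api
          image: example.invalid/catalyst-api:dev   # placeholder image
          env:
            - name: SOVEREIGN_FQDN
              valueFrom:
                configMapKeyRef:
                  name: sovereign-fqdn
                  key: fqdn               # hypothetical key
                  optional: true          # Pod may start before the ConfigMap exists
```

Whether Reloader also reacts to the ConfigMap being created, rather than updated, depends on how it is configured (upstream exposes a reload-on-create option); this sketch does not show that setting.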

* fix(catalyst-api,bp-reloader): tofu state on PVC + Reloader annotations strategy (#715)

Closes #715

Two architectural bugs surfaced live on otech64 (2026-05-03), both leading
to a healthy-looking Sovereign that the operator could not reach.

1. catalyst-api tofu workdir on emptyDir
   CATALYST_TOFU_WORKDIR=/tmp/catalyst/tofu (emptyDir). When contabo's
   catalyst-api Pod rolled mid-apply (the PR #714 deploy commit triggered
   a rolling restart 3 minutes into otech64's tofu run), in-progress state
   was lost. Tofu had created LB/network/server/services but not the
   hcloud_load_balancer_target.control_plane resource yet — the cluster
   came up at the k3s level but the public LB had no targets, returning
   TLS handshake failure for every console.<sov> request.

   Move CATALYST_TOFU_WORKDIR to /var/lib/catalyst/tofu (PVC-backed,
   fsGroup=65534 already wires write access). tofu apply resumes from
   where it left off after any Pod restart.

2. bp-reloader env-vars strategy
   reloadStrategy=env-vars only injects checksum env vars for ConfigMaps
   referenced via envFrom. Workloads using valueFrom: configMapKeyRef
   (catalyst-api's SOVEREIGN_FQDN) are silently not reloaded — the
   configmap.reloader.stakater.com/reload annotation added in PR #714
   was a no-op under env-vars.

   Switch to reloadStrategy=annotations. Reloader bumps a pod-template
   annotation, triggering rollout regardless of how the CM/Secret is
   referenced.
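
The two fixes above can be sketched as follows. The first document is a values override for the upstream Stakater Reloader chart; how bp-reloader nests it is an assumption. The second is a Deployment with the tofu workdir on a PVC; the Deployment name, claim name, volume name, image, and mount point are hypothetical, while the env var, path, and fsGroup come from the commit message.

```yaml
# Helm values (upstream chart key; nest under the subchart name if
# bp-reloader wraps the chart as a dependency, which is an assumption).
reloader:
  reloadStrategy: annotations
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: catalyst-api                      # hypothetical name
spec:
  selector:
    matchLabels:
      app: catalyst-api
  template:
    metadata:
      labels:
        app: catalyst-api
    spec:
      securityContext:
        fsGroup: 65534                    # grants the container write access to the volume
      containers:
        - name: api
          image: example.invalid/catalyst-api:dev   # placeholder image
          env:
            - name: CATALYST_TOFU_WORKDIR
              value: /var/lib/catalyst/tofu
          volumeMounts:
            - name: catalyst-state        # hypothetical volume name
              mountPath: /var/lib/catalyst   # assumed mount point
      volumes:
        - name: catalyst-state
          persistentVolumeClaim:
            claimName: catalyst-api-state # hypothetical claim name
```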

---------

Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
2026-05-04 02:04:26 +04:00
chart           fix(catalyst-api,bp-reloader): tofu state on PVC + Reloader annotations strategy (closes #715) (#716)           2026-05-04 02:04:26 +04:00
blueprint.yaml  feat(platform): security umbrellas (falco/kyverno/trivy/sigstore/syft-grype/reloader/coraza/litmus) (#216)   2026-04-30 06:07:38 +02:00
README.md       docs(pass-10): banners on 7 more components + opentofu active-active drift fix                                2026-04-27 21:43:45 +02:00

Reloader

Automatically restarts Pods when the content hash of a referenced ConfigMap or Secret changes. Per-host-cluster infrastructure (see docs/PLATFORM-TECH-STACK.md §3.4): it runs on every host cluster Catalyst manages. Critical for Catalyst's secret-rotation flow: when ESO updates a K8s Secret from OpenBao, Reloader triggers a rolling restart of the consuming Pods (see docs/SECURITY.md §3).

Category: Operations | Type: Mandatory per host cluster


Overview

Reloader watches for changes to ConfigMaps and Secrets, then triggers rolling restarts of the associated Deployments, StatefulSets, and DaemonSets. This closes the operational gap in which a configuration change takes effect only after someone restarts the Pods manually.

Key Features

  • Automatic rolling restart on ConfigMap/Secret changes
  • Annotation-based opt-in per workload
  • SHA-based change detection (no unnecessary restarts)
  • Minimal resource footprint
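
As a sketch of the annotation-based opt-in, the Deployment below asks Reloader to watch every ConfigMap and Secret it references. The workload, image, and ConfigMap names are hypothetical; `reloader.stakater.com/auto` is Reloader's generic opt-in annotation.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                       # hypothetical workload
  annotations:
    reloader.stakater.com/auto: "true"    # opt in: restart on any referenced ConfigMap/Secret change
spec:
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app
          image: example.invalid/example-app:dev   # placeholder image
          envFrom:
            - configMapRef:
                name: example-app-config  # hypothetical ConfigMap
```

Workloads without a Reloader annotation are left alone, so the restart behavior stays explicit per workload.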

Integration

Component                 Integration
External Secrets (ESO)    Restart pods when secrets rotate
OpenBao                   Secret rotation triggers pod refresh
cert-manager              Certificate renewal triggers restart
Flux                      GitOps config changes auto-propagate
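
A hedged sketch of the ESO integration: an ExternalSecret keeps a K8s Secret in sync, and a Deployment opts in to restarts when that Secret changes. All names, the store, and the remote key path are hypothetical, and the ExternalSecret apiVersion depends on the installed ESO release.

```yaml
apiVersion: external-secrets.io/v1beta1   # newer ESO releases serve external-secrets.io/v1
kind: ExternalSecret
metadata:
  name: example-db-credentials            # hypothetical
spec:
  refreshInterval: 1h                     # how often ESO re-reads the backing store
  secretStoreRef:
    kind: ClusterSecretStore
    name: openbao                         # hypothetical store name
  target:
    name: example-db-credentials          # K8s Secret that ESO creates and updates
  data:
    - secretKey: password
      remoteRef:
        key: apps/example/db              # hypothetical path in the store
        property: password
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                       # hypothetical workload
  annotations:
    secret.reloader.stakater.com/reload: "example-db-credentials"
spec:
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app
          image: example.invalid/example-app:dev   # placeholder image
          env:
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: example-db-credentials
                  key: password
```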

Deployment

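apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: reloader
  namespace: flux-system
spec:
  interval: 10m
  path: ./platform/reloader
  prune: true
  # sourceRef is required by the Kustomization API. The GitRepository name
  # below is an assumption (the `flux bootstrap` default); adjust as needed.
  sourceRef:
    kind: GitRepository
    name: flux-system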

Part of OpenOva