openova/platform/flux/blueprint.yaml
e3mrah 05cb39c042
fix(bp-flux): catalyst-cluster-reconciler ClusterRoleBinding overlay (closes #338) (#393)
PROBLEM
-------
On Sovereign-1 (otech.omani.works, 2026-04-30) every HelmRelease that
transitioned through pending-install/pending-upgrade got stuck because
the helm-controller SA could not UPDATE its own helm-storage Secrets
(sh.helm.release.v1.<name>.<n>) in flux-system. Symptom:

  secrets "sh.helm.release.v1.catalyst-platform.v1" is forbidden:
  User "system:serviceaccount:flux-system:helm-controller" cannot
  update resource "secrets" in API group "" in the namespace "flux-system"

Runtime workaround on otech (added 2026-04-30): manual ClusterRoleBinding
flux-system-helm-controller-admin → cluster-admin → flux-system/helm-controller.
Tracked as the permanent fix in #338.

FIX
---
Add platform/flux/chart/templates/catalyst-cluster-reconciler-rbac.yaml — a
Catalyst-managed ClusterRoleBinding (catalyst-cluster-reconciler) that
binds cluster-admin to helm-controller AND kustomize-controller in
.Values.catalyst.fluxNamespace (default flux-system). Independent from
the upstream subchart's cluster-reconciler binding (different name, no
ownership conflict), so if the upstream binding ever drifts again the
overlay still holds the cluster correct.

WHY cluster-admin (not narrower)
--------------------------------
helm-controller installs arbitrary user-supplied Helm charts which can
ship any K8s resource (CRDs, ClusterRoles, MutatingWebhookConfigurations,
etc.). There is no narrower role that satisfies the full install path.
The Flux project's own bootstrap install.yaml binds cluster-admin for
the same reason (upstream default multitenancy.privileged=true).
Multi-tenancy lockdown is a Sovereign Day-2 hardening choice tracked
separately.

NEVER-HARDCODE COMPLIANCE
-------------------------
Per docs/INVIOLABLE-PRINCIPLES.md #4, the namespace is operator-overridable
via .Values.catalyst.fluxNamespace. Default is flux-system because that's
the canonical Catalyst install namespace (matches cloud-init's flux2
install.yaml + clusters/_template/bootstrap-kit/03-flux.yaml).

VERSION
-------
- bp-flux 1.1.2 → 1.1.3 (Chart.yaml + blueprint.yaml + 3 bootstrap-kit refs).
- The flux2 subchart pin (2.14.1) is unchanged — version-pin replay test
  remains green (cloud-init v2.4.0 == subchart appVersion 2.4.0).

VERIFICATION
------------
- platform/flux/chart/tests/version-pin-replay.sh — all 6 cases PASS.
- platform/flux/chart/tests/observability-toggle.sh — all 3 cases PASS.
- helm template renders the new ClusterRoleBinding with correct subjects
  (flux-system by default; verified --set catalyst.fluxNamespace=custom
  override path).
- scripts/check-bootstrap-deps.sh — 0 drift, 0 cycles.

FILES
-----
- platform/flux/chart/templates/catalyst-cluster-reconciler-rbac.yaml (new)
- platform/flux/chart/Chart.yaml (1.1.2 → 1.1.3)
- platform/flux/chart/values.yaml (catalyst.fluxNamespace default)
- platform/flux/blueprint.yaml (1.1.2 → 1.1.3)
- clusters/{_template,otech.omani.works,omantel.omani.works}/bootstrap-kit/03-flux.yaml (chart version)
- docs/lessons-learned/helm-controller-rbac.md (permanent-fix note)
- docs/omantel-handover-wbs.md (#338 status row)

Refs: #43 #369 #338
Lesson: docs/lessons-learned/helm-controller-rbac.md

Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
2026-05-01 15:56:45 +04:00

16 lines
482 B
YAML

apiVersion: catalyst.openova.io/v1alpha1
kind: Blueprint
metadata:
name: bp-flux
labels:
catalyst.openova.io/section: pts-3-2-gitops-and-iac
spec:
version: 1.1.3
card:
title: flux
summary: GitOps reconciler. One per vcluster (source + kustomize + helm controllers); host-level Flux per host cluster watches its bp-* OCI sources.
visibility: unlisted # mandatory infra, auto-installed by bootstrap kit
manifests:
chart: ./chart
depends: []