Commit Graph

10 Commits

Author SHA1 Message Date
e3mrah
ad9cfc0f23
feat(platform): add global.imageRegistry to bp-openbao/external-secrets/cnpg/valkey/nats-jetstream/powerdns/gitea (PR 2/3, #560) (#565)
Charts with template image refs (fully rewritten when registry set):
- bp-openbao 1.2.4→1.2.5: init-job.yaml + auth-bootstrap-job.yaml — Catalyst
  job images now prefixed with global.imageRegistry when non-empty. Default
  (empty) renders identical manifests.
- bp-powerdns 1.1.5→1.1.6: dnsdist.yaml Catalyst companion image prefixed
  with global.imageRegistry when non-empty. Verified: dnsdist image rewrites
  to harbor.openova.io/docker.io/powerdns/dnsdist-19:1.9.14.

Subchart-only charts (global.imageRegistry stub added; threading via per-component
subchart values.yaml keys documented in comments):
- bp-external-secrets 1.1.0→1.1.1
- bp-cnpg 1.0.0→1.0.1  (charts/ missing = pre-existing state, not this PR)
- bp-valkey 1.0.0→1.0.1 (charts/ missing = pre-existing state, not this PR)
- bp-nats-jetstream 1.1.1→1.1.2
- bp-gitea 1.1.2→1.1.3: upstream chart exposes gitea.image.registry for wiring

vcluster: N/A — no chart directory under platform/vcluster/chart/

Co-authored-by: alierenbaysal <alierenbaysal@openova.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 12:52:43 +04:00
e3mrah
92b7db622d
fix(bp-external-secrets-stores): split ClusterSecretStore into separate chart per #247 pattern (closes #331) (#426)
* fix(bp-external-secrets): split ClusterSecretStore into bp-external-secrets-stores chart (resolves CRD ordering, closes #331)

bp-external-secrets@1.0.0 deadlocked on first install on otech.omani.works:

  Helm install failed for release external-secrets-system/external-secrets
  with chart bp-external-secrets@1.0.0:
  failed post-install: unable to build kubernetes object for deleting hook
  bp-external-secrets/templates/clustersecretstore-vault-region1.yaml:
  resource mapping not found for name: "vault-region1" namespace: ""
  no matches for kind "ClusterSecretStore" in version "external-secrets.io/v1beta1"

Root cause: Helm's `helm.sh/hook-delete-policy: before-hook-creation` ran
a kubectl-style lookup of the existing ClusterSecretStore CR before the
upstream `external-secrets` subchart's CRDs finished registration. The
in-line ClusterSecretStore template (templates/clustersecretstore-vault-
region1.yaml) and the upstream subchart's CRDs co-installed in the same
release; admission ordering wasn't deterministic enough to make the
post-install hook safe.

Fix — same pattern as PR #247 (bp-crossplane@1.1.3 ↔ bp-crossplane-claims@1.0.0):
split the chart into controller + stores. Flux dependsOn orders them.

  - bp-external-secrets@1.1.0 — controller-only (just upstream subchart
    + NetworkPolicy + ServiceMonitor toggle). CRDs register here.
  - bp-external-secrets-stores@1.0.0 (NEW) — the default
    ClusterSecretStore CR; depends on bp-external-secrets being Ready.
    No Helm hooks needed: by the time this chart's HelmRelease starts,
    Flux has already verified bp-external-secrets is Ready=True and
    therefore the CRDs are registered.

Files:
  NEW: platform/external-secrets-stores/blueprint.yaml             (1.0.0)
  NEW: platform/external-secrets-stores/chart/Chart.yaml           (1.0.0; no upstream subchart, annotation `catalyst.openova.io/no-upstream: "true"`)
  NEW: platform/external-secrets-stores/chart/values.yaml          (clusterSecretStore.* knobs moved from controller chart)
  MOVED: platform/external-secrets/chart/templates/clustersecretstore-vault-region1.yaml
       → platform/external-secrets-stores/chart/templates/clustersecretstore-vault-region1.yaml
       (Helm hook annotations removed — Flux dependsOn now handles ordering)
  TOUCHED: platform/external-secrets/chart/Chart.yaml              (1.0.0 → 1.1.0; description note appended)
  TOUCHED: platform/external-secrets/blueprint.yaml                (1.0.0 → 1.1.0)
  TOUCHED: platform/external-secrets/chart/values.yaml             (clusterSecretStore block removed; pointer comment added)
  NEW: clusters/_template/bootstrap-kit/15a-external-secrets-stores.yaml
       (Flux HelmRelease, dependsOn: [bp-external-secrets, bp-openbao])
  TOUCHED: clusters/_template/bootstrap-kit/15-external-secrets.yaml
       (chart version 1.0.0 → 1.1.0)
  TOUCHED: clusters/_template/bootstrap-kit/kustomization.yaml
       (slot 15a inserted after 15)

Out of scope for this PR (separate tickets):
  - blueprint-release.yaml CI fan-out: verify the path-matrix picks up
    the new platform/external-secrets-stores/ directory automatically;
    if not, add the directory to the matrix in a follow-up.
  - Per-Sovereign cluster directory edits (#257 will delete those).
  - Phase 0 minimum trim (#310 will renumber slots; this PR uses 15a as
    a non-disruptive sub-slot insertion that works with both the current
    35-slot kustomization and the eventual 15-slot canonical layout —
    when #310 renumbers, 15 + 15a become 08 + 09 in the canonical order).

Refs: #331 (this issue), #247 (pattern reference — bp-crossplane split),

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(scripts): register bp-external-secrets-stores in expected-bootstrap-deps.yaml

The dependency-graph-audit CI step rejected PR #334 because the new
bp-external-secrets-stores HR was on disk at slot 15a but missing from
the expected DAG. This commit adds it with the same dependsOn shape as
clusters/_template/bootstrap-kit/15a-external-secrets-stores.yaml:
[bp-external-secrets, bp-openbao].

Refs: #331, #310 (Phase 0 minimum), PR #334.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(bp-external-secrets): retire CR cases from controller test, add stores-toggle (#331)

After splitting the default ClusterSecretStore into bp-external-secrets-stores
@1.0.0, the controller chart's observability-toggle integration test still
expected the CR to render in the controller chart (Cases 4 + 5). Those
assertions now belong on the new chart.

Changes:
  - platform/external-secrets/chart/tests/observability-toggle.sh:
    Replace Cases 4+5 with a single inverted assertion — the controller
    chart MUST render ZERO ClusterSecretStore CRs (top-level kind:); only
    the upstream subchart's CRD definition (whose spec.names.kind value is
    "ClusterSecretStore" at non-zero indent) is allowed.
  - platform/external-secrets-stores/chart/tests/clustersecretstore-toggle.sh:
    NEW. Mirrors the retired Cases 4+5 against the stores chart, plus a
    Case 3 that asserts clusterSecretStore.server overrides propagate.

Local smoke:
  bash platform/external-secrets/chart/tests/observability-toggle.sh         → 4/4 PASS
  bash platform/external-secrets-stores/chart/tests/clustersecretstore-toggle.sh → 3/3 PASS

Refs: #331, PR #334.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(scripts): handle alphanumeric sub-slot suffixes in check-bootstrap-deps.sh

PR #334 (issue #331) added slot 15a-external-secrets-stores as a sub-slot
between numeric slots 15 and 16. The bootstrap-deps audit script's
`printf '%02d'` formatter rejected `15a` with:

  scripts/check-bootstrap-deps.sh: line 390: printf: 15a: invalid number

Fix: detect non-numeric slot tokens and pass them through verbatim. Numeric
slots still render as zero-padded `01..49` for output alignment.

Local smoke:
  $ bash scripts/check-bootstrap-deps.sh
  ...
    [P] slot 15  bp-external-secrets        <-- bp-cert-manager bp-openbao
    [P] slot 15a bp-external-secrets-stores <-- bp-external-secrets bp-openbao
  ...
  OK: bootstrap-kit dependency graph audit PASSED

Refs: #331, PR #334.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(wbs): tick #331 chart-released

bp-external-secrets@1.1.0 (controller-only) + bp-external-secrets-stores@1.0.0
(NEW) shipped in PR #426. Helm-template acceptance + both toggle tests +
dependency-graph-audit all green. Sovereign-impact deferred to Phase 8.

Refs: #331, PR #426.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Hatice Yildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: hatiyildiz <hatiyildiz@noreply.github.com>
2026-05-01 17:33:47 +04:00
e3mrah
9554be4a5e
fix(bp-external-secrets): gate ClusterSecretStore on CRD presence + drop delete-policy (#337)
The chart's post-install hook was failing on otech.omani.works:
  failed post-install: unable to build kubernetes object for deleting hook
  bp-external-secrets/templates/clustersecretstore-vault-region1.yaml:
  resource mapping not found for kind ClusterSecretStore in version
  external-secrets.io/v1beta1
Two corrections:
1. Capabilities-gate the entire template — don't render unless the
   ClusterSecretStore CRD is registered (it ships in via the upstream
   ESO subchart but isn't live on first install)
2. Remove 'before-hook-creation' delete-policy (was the actual trigger
   for the 'deleting hook' failure path)
Bumped 1.0.0 → 1.0.1.

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-04-30 23:31:24 +04:00
e3mrah
9dc8506dd9
feat(charts): bp-external-secrets + bp-cnpg + bp-valkey wrapper charts (#285)
Storage-substrate batch (W2.5.A) — closes #254 by shipping the three
upstream-subchart umbrella Blueprints that the Flux HRs at
clusters/_template/bootstrap-kit/{15-external-secrets,16-cnpg,17-valkey}
.yaml (merged via PR #262) target.

Each chart follows the canonical umbrella pattern documented in
docs/BLUEPRINT-AUTHORING.md §11.1: Chart.yaml declares the upstream
chart under `dependencies:` so `helm dependency build` bundles the
upstream payload into the OCI artifact, and Catalyst-curated overlay
values + templates sit alongside in chart/values.yaml + chart/templates/.

Per-chart highlights:
- bp-external-secrets/1.0.0 — wraps external-secrets/external-secrets
  0.10.7. Ships a default `vault-region1` ClusterSecretStore (via Helm
  post-install/post-upgrade hook to defer the CR application until the
  upstream chart's CRDs are registered) wired to the in-cluster
  bp-openbao service. clusterSecretStore.enabled toggle lets cluster
  overlays opt out and author their own multi-region CRs.
- bp-cnpg/1.0.0 — wraps cnpg/cloudnative-pg 0.28.0. Operator-only
  surface (Cluster CRs are per-Application). CRDs ship in-chart so
  bp-powerdns / bp-keycloak / bp-gitea / bp-langfuse / bp-grafana /
  bp-temporal / bp-matrix / bp-llm-gateway / bp-bge / bp-nemo-guardrails
  / bp-openmeter / pool-domain-manager can `dependsOn: bp-cnpg` via
  Flux — closing #254 (bp-powerdns CreateContainerConfigError on
  pdns-pg-app secret).
- bp-valkey/1.0.0 — wraps bitnami/valkey 5.5.1. BSD-3 Redis-compatible
  cache, replication architecture, password auth ON, NetworkPolicy ON,
  replicas 0 by default for solo Sovereigns (cluster overlays bump for
  HA). Application-tier cache only — Catalyst control plane uses NATS
  JetStream KV (per ARCHITECTURE.md §5).

Per docs/BLUEPRINT-AUTHORING.md §11.2 (issue #182): every observability
toggle defaults `false` (ServiceMonitor / PodMonitor / PrometheusRule /
metrics sidecar) and is operator-tunable via per-cluster overlay once
bp-kube-prometheus-stack reconciles. Each chart ships
tests/observability-toggle.sh covering default-off, opt-in (--api-versions
monitoring.coreos.com/v1 to simulate the CRDs), and explicit-off cases.

Per docs/INVIOLABLE-PRINCIPLES.md #4 (never hardcode): every upstream
version, namespace, server URL, role, and password toggle is exposed
under values.yaml. Cluster overlays in clusters/<sovereign>/ may
override without rebuilding the Blueprint OCI artifact.

helm lint: 1 chart(s) linted, 0 chart(s) failed (each, INFO icon-recommended only)
helm template default render kinds:
  bp-external-secrets: ClusterRole, ClusterRoleBinding, ClusterSecretStore, CustomResourceDefinition, Deployment, Role, RoleBinding, Secret, Service, ServiceAccount, ValidatingWebhookConfiguration
  bp-cnpg:             ClusterRole, ClusterRoleBinding, ConfigMap, CustomResourceDefinition, Deployment, MutatingWebhookConfiguration, Service, ServiceAccount, ValidatingWebhookConfiguration
  bp-valkey:           ConfigMap, NetworkPolicy, PodDisruptionBudget, Secret, Service, ServiceAccount, StatefulSet

Closes #254

Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
2026-04-30 18:39:29 +04:00
hatiyildiz
bc9b90d989 docs(pass-35): completion sweep for surviving DNS placeholders (8 components)
Started as gitea + relay atomic check. The gitea fix surfaced surviving
<domain> placeholders across 8 other component READMEs that prior sweeps
(Pass 29: canonical docs, Pass 32: image registries) hadn't covered.

Catalyst control-plane DNS fixes (-> {component}.<location-code>.<sovereign-domain>):
- gitea: GITEA_INSTANCE_URL.
- external-secrets: openbao ClusterSecretStore + gitea Flux GitRepository.

Application DNS fixes (-> {app}.<env>.<sovereign-domain>):
- temporal: had two drift items in one line — temporal.fuse.<domain>
  (old "fuse" product name + wrong placeholder shape). Pass 32 fixed
  the image ref on the same file but missed this. Now fully de-drifted.
- valkey: --replicaof valkey.region1.<domain> (non-canonical region1
  segment — Catalyst encodes regions in location-code).
- strimzi: kafka-kafka-bootstrap.region1.<domain>:9092 — same.
- cnpg: postgres.region1.<domain> cross-region replica host — same.
- stunner: STUN/TURN realm — kept canonical Application form for
  consistency even though STUN realms are nominally opaque.
- k8gb: Gslb ingress host app.gslb.<domain> -> app.gslb.<sovereign-domain>.
  Other illustrative k8gb refs (dnsZone, nslookup examples) preserved
  as they describe behavior generically.

products/relay/README.md: clean.

Preserved as correctly-generic: external-dns illustrative refs,
cert-manager <domain> (customer-supplied cert names), stalwart <domain>
(customer email-receiving domain).

Validation log Pass 35 entry: third end-to-end DNS sweep iteration
(29 -> 32 -> 35). Future passes should grep for bare <domain> early to
catch new instances introduced during edits.
2026-04-27 22:46:16 +02:00
hatiyildiz
42aeb629bb docs(pass-7): rewrite OpenBao + ESO READMEs to match agreed multi-region semantics
Pass 7 — line-by-line read of platform/openbao/README.md and
platform/external-secrets/README.md found a major architectural drift:
both files described an OLD active-active bidirectional sync model
that contradicts docs/SECURITY.md §5 (the canonical reference).

The active-active design was rejected during the architecture session
because it would have been a stretched cluster — a single region's
network blip would block writes everywhere. The agreed model is:

- Independent Raft cluster per region (intra-region quorum only).
- Single-primary writes; replicas accept reads only.
- Async Performance Replication primary → replicas (lag <1s typical).
- Explicit DR promotion (sovereign-admin or failover-controller).

Fixes:

platform/openbao/README.md:
- Overview: removed "active-active deployments" / "either region can
  update secrets". Replaced with "independent Raft cluster per region",
  "asynchronous Performance Replication".
- Architecture diagram: replaced bidirectional-push diagram with the
  primary→replicas async perf replication topology that matches
  SECURITY.md §5.
- ClusterSecretStores: simplified from "two stores (local+remote)" to
  "one local store"; reads always pull locally.
- Renamed "PushSecret (Bidirectional)" → "Writes go to the primary
  region" with a single-target PushSecret pointing at bao-primary.
- Added DR promotion section pointing at SECURITY.md §5.2.
- Status banner: notes that the canonical multi-region reference is
  SECURITY.md.

platform/external-secrets/README.md:
- Header line: repositioned as per-host-cluster infrastructure with
  pointer to PLATFORM-TECH-STACK §3.3.
- Removed broken link to non-existent ../openbao/docs/ADR-OPENBAO.md
  (replaced with link to ../openbao/README.md).
- "Multi-region sync | Push to both OpenBao instances simultaneously"
  → "Multi-region reads | Async perf replication".
- "PushSecret to Multiple OpenBao Instances" example was writing to
  two ClusterSecretStores in parallel — replaced with single-target
  primary write.
- "Multi-region sync via single PushSecret" in Consequences →
  "Cross-region availability via Performance Replication".
- Mermaid sequence diagram: "Bootstrap Wizard" actor → "Catalyst
  Bootstrap (Phase 0)"; "Terraform" → "OpenTofu"; ESO connection
  description "via K8s auth" → "via SPIFFE SVID (workload identity)".

These were the most consequential drift fixes found in any pass —
two READMEs were documenting an architecture explicitly rejected by
the agreed model.

Refs #37
2026-04-27 21:34:09 +02:00
hatiyildiz
d6a51b8a7a docs(pass-2): final entity-noun sweep — external-secrets sequence diagram
Pass 2 — fresh-eyes sweep across the entire docs tree. One residual
entity-noun usage found:

- platform/external-secrets/README.md:75 (in a Mermaid sequence
  diagram): "Note over Wizard: Operator saves unseal keys offline"
  — "Operator" used as person/entity. Renamed to "sovereign-admin"
  to match the role from GLOSSARY.md.

All other banned-term sweeps clean:
- No tenant (architectural) anywhere.
- No Catalyst IDP anywhere.
- No Synapse-as-product anywhere (only the legitimate
  "Matrix/Synapse server" usages).
- No workspace-controller (only the banned-term entries that define
  the rename).
- No capital-W Workspace as Catalyst scope.
- No github.com/openova (without -io).
- All cross-doc Markdown links resolve.
- All §X references resolve to the new section numbering after
  PLATFORM-TECH-STACK reorg.
- API group catalyst.openova.io/v1alpha1 consistent across 6 references.
- OCI artifact prefix `bp-` consistent across README, CLAUDE,
  BLUEPRINT-AUTHORING, IMPLEMENTATION-STATUS.

Other "Operator" mentions intentionally retained (legitimate
technical usage):
- "External Secrets Operator (ESO)", "Trivy Operator" — K8s
  Operator pattern (controllers), explicitly allowed by GLOSSARY.
- "Operator compatibility" in BUSINESS-STRATEGY's OpenShift migration
  table — refers to compatibility with K8s Operators (the technology),
  not as an entity/role.

Refs #37
2026-04-27 21:18:55 +02:00
hatiyildiz
119a1e53a0 docs(components): terminology pass across platform and product READMEs
Bring per-component READMEs in line with the canonical glossary
(docs/GLOSSARY.md). Substantive architectural content unchanged —
this is a terminology + reference correctness pass.

Placeholder rename: <tenant> → <org> in YAML / IaC examples across
- platform/cnpg/README.md           (Cluster + Pooler + ScheduledBackup)
- platform/debezium/README.md       (PostgreSQL connector + topic patterns)
- platform/external-secrets/README.md (ExternalSecret / SecretStore)
- platform/grafana/README.md        (Instrumentation namespace)
- platform/k8gb/README.md           (Gslb + namespace + kubectl examples)
- platform/keda/README.md           (ScaledObject + Kafka triggers + Prometheus)
- platform/opentofu/README.md       (server resource example)
- platform/velero/README.md         (BackupStorageLocation buckets)
- platform/vpa/README.md            (VerticalPodAutoscaler examples)
- platform/flux/README.md           (kustomization name + tenants/ → organizations/)

"Catalyst IDP" → "Catalyst console":
- platform/crossplane/README.md     (integration section retitled and
                                      rewritten — Crossplane is platform
                                      plumbing, not user-facing)
- platform/gitea/README.md          (architecture diagram + integration table)
- platform/kyverno/README.md        (rollout tracking surface)
- products/fingate/README.md        (TPP onboarding portal)

"Bootstrap wizard" → "Catalyst bootstrap":
- platform/openbao/README.md        (bootstrap procedure rewritten —
                                      independent Raft per region clarified;
                                      cross-references docs/SECURITY.md §5)
- platform/opentofu/README.md       (Quick Start)

Kyverno labels & prose:
- openova.io/tenant → openova.io/organization (label rename for
  consistency; deployed clusters will add new label as a co-label
  during migration window)
- "tenant labels" / "tenant namespace" prose updated to
  "Organization labels" / "Organization-labeled namespace"
- Priority class names (tenant-high, tenant-default, tenant-batch)
  retained as deployed artifact names — rename pending in a
  separate migration ticket

No banned-term hits remain in component READMEs (verified by grep
in docs/GLOSSARY.md banned-terms table).

Refs #37
2026-04-27 20:06:51 +02:00
talent-mesh
10245dff98 feat: ecosystem expansion to 55 components with license compliance
- Replace BSL-licensed components with open-source alternatives:
  Terraform→OpenTofu (MPL 2.0), Vault→OpenBao (MPL 2.0),
  Redpanda→Strimzi/Kafka (Apache 2.0), n8n→Airflow (Apache 2.0)
- Add 14 new platform components: activemq, camel, clickhouse, dapr,
  debezium, falco, flink, iceberg, opensearch, rabbitmq, superset,
  temporal, trino, vitess
- Rename meta-platforms/ to products/ with new product names:
  Cortex (AI Hub), Fingate (Open Banking), Titan (Data Lakehouse),
  Fuse (Microservices Integration)
- Update all documentation, READMEs, and cross-references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-11 18:15:11 +00:00
talent-mesh
c9d04a53b4 refactor: flatten platform/ structure (41 components)
Remove hierarchical grouping (networking/, security/, etc.) and use flat
structure for all 41 platform components.

Changes:
- All components now directly under platform/ (no subfolders)
- AI Hub components moved from meta-platforms/ai-hub/components/ to platform/
- Open Banking components (lago, openmeter) moved to platform/
- meta-platforms/ now only contains README files that reference platform/
- Open Banking custom services remain in meta-platforms/open-banking/services/

Structure:
- platform/ (41 components, flat)
- meta-platforms/ai-hub/ (README only, references platform/)
- meta-platforms/open-banking/ (README + 6 custom services)

All documentation links updated.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 15:19:48 +00:00