Two SME-blocker bugs caught live on otech113 (alice signup gate 5 fails
on fresh Sovereign):

#952 — bp-newapi 1.4.0 Pod has no imagePullSecrets, so kubelet pulls
PRIVATE ghcr.io/openova-io/openova/{newapi-mirror,services-metering-sidecar}
anonymously and gets 403 Forbidden.

Fix:
- Templatize spec.imagePullSecrets on Deployment + channel-seed Job.
- Default values.yaml `imagePullSecrets: [{name: ghcr-pull}]`.
- Add `newapi` to flux-system/ghcr-pull's reflector
  reflection-{allowed,auto}-namespaces in cloudinit-control-plane.tftpl
  so bp-reflector mirrors the source Secret into the namespace
  automatically on every fresh Sovereign.
- Bump bp-newapi 1.4.0 -> 1.4.1, update _template overlay.

#953 — services-build.yaml's image-rewrite loop only matched the
hardcoded `image: ghcr.io/.../services-<svc>:<sha>` form. 7 of 8
sme-services templates use
`image: "{{ ... }}/services-<svc>:{{ .Values.images.smeTag }}"`.
Each services-build run bumped only auth.yaml while reporting
"update sme service images to ${SHA}", leaving the live Pod on stale
bytes (PR #951's #941 fix never reached services-catalog despite the
merge + chart bump chain).

Fix:
- After the hardcoded loop, also bump `images.smeTag` in
  products/catalyst/chart/values.yaml with a strict regex match
  (`^ smeTag: "<sha>"$`); refuse to auto-bump if the line shape changes
  (defends against silent drift if a contributor renames the field).
- Mirror the change into the retry-path `rewrite()` function so a
  reset-to-origin/main retry does not recreate the original bug.

Tests:
- platform/newapi/chart/tests/imagepullsecrets-render.sh — 4 cases
  asserting the Deployment and channel-seed Job carry the default
  ghcr-pull reference, that an empty override suppresses the block, and
  that custom secret names propagate (Inviolable Principle #4).
- tests/integration/services-build-rewrite.sh — 3 cases reproducing the
  workflow's rewrite logic on a sandboxed copy of the live chart,
  asserting both auth.yaml's hardcoded line AND values.yaml's smeTag get
  bumped, that helm-render of the catalyst chart with the bumped values
  produces all 8 SME-service Deployments at the new SHA, and that an
  idempotent re-bump to a second SHA also lands cleanly.

Refs: #952 #953 (umbrella #915 — alice signup gate 5).

Co-authored-by: hatiyildiz <143030955+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
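
For readers unfamiliar with the #952 fix, the sketch below shows one conventional way a Helm chart templatizes `spec.imagePullSecrets`. The `imagePullSecrets` default comes from the commit message above; the template path and the exact `with`/`toYaml` wording are assumptions, since the chart's real template is not shown here.

```yaml
# values.yaml (default named in the commit message)
imagePullSecrets:
  - name: ghcr-pull

# templates/deployment.yaml (hypothetical fragment, not the chart's actual file)
spec:
  template:
    spec:
      # `with` skips the whole block when the list is empty, so an
      # override of `imagePullSecrets: []` renders no imagePullSecrets key.
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
```

With a pattern like this, an operator overlay that sets a different Secret name is passed through unchanged, which matches the "custom secret names propagate" case listed under Tests.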
213 lines
8.8 KiB
YAML
# bp-newapi — Catalyst Application Blueprint, bootstrap-kit slot 80.
# Multi-tenant LLM marketplace gateway. Ships in backend-only mode: the
# OpenAI-compatible API at api.<sovereign-fqdn>/v1/* is customer-facing;
# the upstream's portal UI is disabled at ingress; Catalyst replaces it
# as the customer surface; NewAPI's admin UI at admin.<sovereign-fqdn>
# is exposed only to ops staff (Keycloak-gated).
#
# This slot enables the SME-tenant turnkey experience (epic #795). The
# Catalyst signup hook (delivered by unified-rbac in #802 against the
# contract recorded in ADR-0003) reads the `catalyst-newapi-admin-token`
# Secret rendered by this chart's ExternalSecret to issue per-user API
# keys against NewAPI's admin API at `http://newapi.newapi.svc`.
#
# Wrapper chart: platform/newapi/chart/
# Catalyst-curated values: platform/newapi/chart/values.yaml
# Reconciled by: Flux on the new Sovereign's k3s control plane.

---
apiVersion: v1
kind: Namespace
metadata:
  name: newapi
  labels:
    catalyst.openova.io/sovereign: ${SOVEREIGN_FQDN}
---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: bp-newapi
  namespace: flux-system
spec:
  type: oci
  interval: 15m
  url: oci://ghcr.io/openova-io
  secretRef:
    name: ghcr-pull
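# Illustrative sketch, not part of this manifest. The secretRef above only
# lets Flux pull the chart from within flux-system; Pods in the `newapi`
# namespace need their own copy of `ghcr-pull`. Assuming bp-reflector wraps
# the emberstack Reflector controller (an assumption, not confirmed here),
# the source Secret's annotations would look roughly like this. The
# namespace lists are placeholders; the real ones are set in
# cloudinit-control-plane.tftpl.
#
#   metadata:
#     name: ghcr-pull
#     namespace: flux-system
#     annotations:
#       reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
#       reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "newapi"
#       reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true"
#       reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "newapi"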
---
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: bp-newapi
  namespace: flux-system
spec:
  interval: 15m
  releaseName: newapi
  targetNamespace: newapi
  # bp-newapi depends on:
  #   - bp-openbao(08): the secret backend the chart's ExternalSecret
  #     pulls `ADMIN_API_TOKEN` from. Without OpenBao Ready, the
  #     ExternalSecret never resolves and the Catalyst signup hook can't
  #     reach the NewAPI admin API.
  #   - bp-keycloak(09): the OIDC issuer for the ops-staff admin UI at
  #     admin.<sovereign-fqdn>. Without Keycloak Ready, the OIDC
  #     middleware can't redirect ops-staff requests.
  #   - bp-cnpg(16): operator provisions the Postgres cluster for users,
  #     credits, channels, and audit log via a Crossplane
  #     PostgresqlInstance claim once cnpg is Ready. The DSN is mounted
  #     into NewAPI via `database.existingSecret` (operator-set).
  dependsOn:
    - name: bp-openbao
    - name: bp-keycloak
    - name: bp-cnpg
  chart:
    spec:
      chart: bp-newapi
      # 1.4.0 (issue #943, 2026-05-05): auto-provision CNPG-backed
      # Postgres + chart-emitted SESSION_SECRET/CRYPTO_SECRET so a
      # Sovereign install lands a real Pod without operator intervention.
      # Pre-#943 the Deployment silently skipped render whenever
      # database.existingSecret OR credentials.existingSecret was
      # empty (the bootstrap-kit overlay supplies neither), so NewAPI
      # never came up and alice signup gate 5 (LLM) timed out. Both
      # auto-provisions are capability-gated on bp-cnpg's CRD and
      # operator-overridable per Inviolable Principle #4.
      # 1.3.0: defaultChannels.qwenBankDhofar (channel #1 = Qwen3.6 @
      # https://llm-api.omtd.bankdhofar.com) + post-install/post-upgrade
      # `channel-seed` Helm hook Job that idempotently POSTs default
      # channels into NewAPI's admin API. Issue #915 (epic SME tenant
      # integration DoD: alice → OpenClaw → NewAPI → Qwen3.6@BankDhofar
      # end-to-end).
      # 1.2.0: Traefik Middleware gated behind ingress.middleware.enabled.
      # 1.4.1 (issue #952, 2026-05-05): Pod imagePullSecrets templated +
      # default to `[{name: ghcr-pull}]` so kubelet authenticates pulls
      # of the PRIVATE newapi-mirror + metering-sidecar images. Paired
      # with cloud-init adding `newapi` to flux-system/ghcr-pull's
      # reflector auto-namespaces list.
      version: 1.4.1
      sourceRef:
        kind: HelmRepository
        name: bp-newapi
        namespace: flux-system
  # Event-driven install per docs/INVIOLABLE-PRINCIPLES.md #3 (Flux
  # dependsOn is the gate, not Helm timeout). NewAPI itself starts in
  # ~10 s once the Postgres DSN Secret is present; the long pole is
  # waiting for the operator's Crossplane claim to materialise the DB.
  install:
    disableWait: true
    remediation:
      retries: 3
  upgrade:
    disableWait: true
    remediation:
      retries: 3
  # Per-Sovereign overrides — the operator MUST supply at install time:
  #   - ingress.host = api.${SOVEREIGN_FQDN}
  #   - ingress.adminHost = admin.${SOVEREIGN_FQDN}
  #   - auth.adminUI.keycloak.issuer = https://auth.${SOVEREIGN_FQDN}/realms/ops
  #   - database.existingSecret = Postgres DSN Secret (from the
  #     Crossplane PostgresqlInstance claim)
  #   - credentials.existingSecret = SESSION_SECRET + CRYPTO_SECRET
  #     (rotated via OpenBao)
  #   - catalystIntegration.externalSecret.remoteRef.key
  #     = sovereign/${SOVEREIGN_FQDN}/newapi/admin-token
  #   - defaultChannels.vllm.enabled = true (first-otech)
  #   - defaultChannels.vllm.endpoint
  #     + defaultChannels.vllm.attestation.owner
  #
  # Defaults below wire the first-otech provider channel to the same
  # upstream the OpenOva marketing site uses (Qwen via Axon →
  # `llm-api.omtd.bankdhofar.com`, model `qwen3-coder`); the operator
  # overlay overrides any of these by setting them in this HelmRelease's
  # spec.values.
  values:
    sovereignFQDN: ${SOVEREIGN_FQDN}
    ingress:
      host: api.${SOVEREIGN_FQDN}
      adminHost: admin.${SOVEREIGN_FQDN}
      tls:
        enabled: true
        issuer: letsencrypt-prod
    auth:
      adminUI:
        mode: keycloak
        keycloak:
          issuer: https://auth.${SOVEREIGN_FQDN}/realms/ops
          clientId: newapi-admin
          existingSecret: newapi-oidc
      customerAPI:
        keyIssuer: catalyst
    catalystIntegration:
      enabled: true
      existingSecret: catalyst-newapi-admin-token
      externalSecret:
        enabled: true
        refreshInterval: "1h"
        secretStoreRef:
          kind: ClusterSecretStore
          name: vault-region1
        remoteRef:
          # Canonical OpenBao path per docs/INVIOLABLE-PRINCIPLES.md #4.
          # Under the `vault-region1` store's `secret/` mount the full
          # path is `secret/sovereign/<fqdn>/newapi/admin-token`.
          key: sovereign/${SOVEREIGN_FQDN}/newapi/admin-token
          property: ADMIN_API_TOKEN
    # Default channels — chart-side composition (channel #1 first).
    #
    # `qwenBankDhofar` (issue #915) is the canonical first channel:
    # Qwen3.6 hosted at BankDhofar (https://llm-api.omtd.bankdhofar.com,
    # model `qwen3-coder` / alias `qwen3.6`) — the SAME relay the
    # OpenOva marketing site's Axon helmrelease consumes
    # (openova-private/clusters/contabo-mkt/apps/axon/helmrelease.yaml).
    # Disabled in the template so a fresh Sovereign does not silently
    # wire customers to a third-party endpoint; per-Sovereign overlays
    # (clusters/<sovereign>/bootstrap-kit/80-newapi.yaml) enable this
    # block and supply:
    #   - defaultChannels.qwenBankDhofar.enabled = true
    #   - defaultChannels.qwenBankDhofar.endpoint = https://llm-api.omtd.bankdhofar.com
    #   - defaultChannels.qwenBankDhofar.attestation.accountId (legal-team-owned)
    #   - defaultChannels.qwenBankDhofar.attestation.contractRef (legal-team-owned)
    #   - the Secret `newapi-channel-qwen-bankdhofar` containing the
    #     upstream API key under key `API_KEY` (or an ExternalSecret
    #     pulling from OpenBao at
    #     `sovereign/<sovereign-fqdn>/newapi/channel-qwen-bankdhofar`)
    #   - auth.adminUI.masterKeySecret = name of a Secret carrying
    #     `MASTER_KEY` (NewAPI bootstrap admin auth) — required for
    #     the channel-seed Helm hook Job to POST against the admin API
    #     ONCE at install time. Operator may rotate the master key out
    #     post-bootstrap; channels persist in Postgres.
    #
    # When the operator flips `qwenBankDhofar.enabled: true`, the
    # chart's post-install/post-upgrade `channel-seed` Job probes
    # NewAPI's admin API (`/api/channel/?keyword=<name>`) and POSTs
    # the channel definition idempotently. Re-runs after upgrades
    # are no-ops once the channel exists.
    #
    # The legacy `vllm` slot (in-cluster vLLM fallback) remains for
    # operators that run their own bp-vllm + open-weight model
    # in-cluster; it composes after `qwenBankDhofar` and any operator
    # `.Values.channels`.
    defaultChannels:
      qwenBankDhofar:
        enabled: false
        name: qwen3.6-bankdhofar
        endpoint: ""
        models:
          - qwen3.6
          - qwen3-coder
        existingSecret: newapi-channel-qwen-bankdhofar
        existingSecretKey: API_KEY
        attestation:
          kind: commercial-contract
          accountId: ""
          contractRef: ""
      vllm:
        enabled: false
        name: qwen
        endpoint: ""
        models:
          - qwen3-coder
        attestation:
          kind: in-cluster
          owner: ${SOVEREIGN_FQDN}
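#
# Illustrative sketch only: how a per-Sovereign overlay
# (clusters/<sovereign>/bootstrap-kit/80-newapi.yaml) might enable the
# `qwenBankDhofar` channel under this HelmRelease's spec.values, following
# the list of required fields in the comments above. The Secret name
# `newapi-master-key` and the bracketed placeholders are hypothetical; the
# legal team owns the real attestation values.
#
#   values:
#     auth:
#       adminUI:
#         masterKeySecret: newapi-master-key   # hypothetical Secret carrying MASTER_KEY
#     defaultChannels:
#       qwenBankDhofar:
#         enabled: true
#         endpoint: https://llm-api.omtd.bankdhofar.com
#         attestation:
#           accountId: "<legal-team-owned>"
#           contractRef: "<legal-team-owned>"
#
# The `newapi-channel-qwen-bankdhofar` Secret (key `API_KEY`) must exist in
# the `newapi` namespace before the channel-seed Job runs.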