prov #20: bp-newapi 1.4.2 HR FAILED with the chart's templates/external-secret.yaml apply rejected by the apiserver:

    Internal error occurred: failed calling webhook
    "validate.externalsecret.external-secrets.io": ...
    no endpoints available for service "external-secrets-webhook"

bp-external-secrets reaches HR Ready=True the moment its Deployments report Ready, but Pod Ready != webhook EndpointSlice reachable: the apiserver-side EndpointSlice for the webhook Service has not been observed by the validating admission controller's lookup yet. Flux dependsOn satisfies the dependency graph but does NOT close this race.

Same root-cause class as Fix #137 (bp-external-secrets-stores) but a DIFFERENT chart and DIFFERENT validation endpoint (ExternalSecret vs ClusterSecretStore).

Canonical seam (Inviolable Principle #16): the chart that CONSUMES the webhook owns the readiness gate. NOT the upstream external-secrets chart (Fix #137 territory) and NOT a Flux HR-level dependsOn (which checks the wrong layer).

Adds platform/newapi/chart/templates/000-external-secrets-webhook-readiness-job.yaml, a pre-install/pre-upgrade Helm hook that polls the webhook (default external-secrets-webhook.external-secrets-system.svc:443/validate-external-secrets-io-v1beta1-externalsecret) until it returns a structured HTTP response (200/400/405/415/422). 60s wall budget, 2s interval, no RBAC required (curl-only Pod, HTTPS to ClusterIP).

Templated end-to-end via .Values.externalSecretsWebhookGate.* per Inviolable Principle #4: operator may override service, namespace, port, path, timeout, interval, or disable the gate entirely from a per-Sovereign overlay.

Capability-gated on the external-secrets.io/v1beta1 CRD AND on the existing catalystIntegration.externalSecret.enabled chain, so a Sovereign that disables catalyst-integration pays no probe overhead.

Chart 1.4.2 -> 1.4.4 (1.4.3 was a deploy-only image-tag bump). HR template clusters/_template/bootstrap-kit/80-newapi.yaml repinned.
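For orientation, a readiness gate of the kind described above could look roughly like the sketch below. This is not the committed template: the key names under `.Values.externalSecretsWebhookGate` (`enabled`, `image`, `service`, `namespace`, `port`, `path`, `timeoutSeconds`, `intervalSeconds`), the Job name, the hook weight, and the use of `curl -k` are all assumptions made for illustration. It also assumes `catalystIntegration.externalSecret.enabled` is defined in the chart's values.yaml.

```yaml
{{- /* Hypothetical sketch of a webhook readiness gate; key names are assumed. */}}
{{- $gate := .Values.externalSecretsWebhookGate | default dict }}
{{- if and $gate.enabled .Values.catalystIntegration.externalSecret.enabled (.Capabilities.APIVersions.Has "external-secrets.io/v1beta1") }}
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-eso-webhook-gate   # assumed name
  annotations:
    # Run before any chart resource (including the ExternalSecret) is applied.
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-weight": "-10"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: probe
          image: {{ $gate.image | quote }}   # any image shipping curl and sh
          command: ["/bin/sh", "-c"]
          args:
            - |
              url="https://{{ $gate.service }}.{{ $gate.namespace }}.svc:{{ $gate.port }}{{ $gate.path }}"
              deadline=$(( $(date +%s) + {{ $gate.timeoutSeconds }} ))
              while [ "$(date +%s)" -lt "$deadline" ]; do
                # curl prints 000 when no connection could be made at all.
                code=$(curl -sk -o /dev/null -m 5 -w '%{http_code}' -X POST "$url" || true)
                case "$code" in
                  # Any structured answer proves the webhook endpoint is serving.
                  200|400|405|415|422) echo "webhook answered HTTP $code"; exit 0 ;;
                esac
                echo "webhook not ready (got '$code'); retrying"
                sleep {{ $gate.intervalSeconds }}
              done
              echo "webhook not reachable within budget" >&2
              exit 1
{{- end }}
```

Under the same assumed key names, a per-Sovereign overlay that restates the defaults mentioned in this commit (or sets `enabled: false` to skip the gate) might read:

```yaml
externalSecretsWebhookGate:
  enabled: true
  image: curlimages/curl:8.8.0   # assumed image; not named in the commit
  service: external-secrets-webhook
  namespace: external-secrets-system
  port: 443
  path: /validate-external-secrets-io-v1beta1-externalsecret
  timeoutSeconds: 60
  intervalSeconds: 2
```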
## Claimed TCs

Infra-only fix; no UI behaviour change. Unblocks bp-newapi reaching HR Ready=True on every fresh provision, which is a hard prerequisite for:

- ADR-0003 §3.2 Catalyst signup hook (alice -> per-user NewAPI key)
- alice signup gate 5 (LLM) end-to-end
- Any TC that exercises /v1/* customer API or admin.<sovereign-fqdn>

Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
163 lines
8.4 KiB
YAML
apiVersion: v2
name: bp-newapi
# 1.4.4 (qa-loop bounded-cycle audit prov #20 Fix #138, 2026-05-11): add
# pre-install/pre-upgrade hook (templates/000-external-secrets-webhook-
# readiness-job.yaml) that polls the external-secrets validating-
# admission webhook (`external-secrets-webhook.external-secrets-system.
# svc:443/validate-external-secrets-io-v1beta1-externalsecret`) until
# it returns a structured HTTP response (200/400/405/415/422). Closes
# the race between bp-external-secrets reaching HR Ready=True (Pods
# Ready) and the apiserver-side EndpointSlice for the webhook Service
# being observable by the validating admission controller.
#
# ROOT CAUSE (prov #20):
#   bp-newapi 1.4.2 HR FAILED with the chart's templates/external-
#   secret.yaml apply rejected by the apiserver:
#     Internal error occurred: failed calling webhook
#     "validate.externalsecret.external-secrets.io": ...
#     no endpoints available for service "external-secrets-webhook"
#   bp-external-secrets satisfies Flux dependsOn the moment its
#   Deployments report Ready, but Pod Ready ≠ webhook EndpointSlice
#   reachable. The chart immediately tried to apply ExternalSecret and
#   the webhook returned 503/connect-error.
#
# CANONICAL SEAM (Inviolable Principle #16):
#   The chart that CONSUMES the webhook owns the readiness gate — NOT
#   the upstream external-secrets chart (owned by Fix #137 territory)
#   and NOT a Flux HR-level dependsOn (which checks the wrong layer).
#   Fix #138 mirrors Fix #137's pattern but probes a different
#   validation endpoint (ExternalSecret vs ClusterSecretStore).
#
# TEMPLATABILITY (Inviolable Principle #4):
#   Every knob (webhook service name, namespace, port, path, timeout,
#   interval, gate-enabled flag) is operator-tunable from per-
#   Sovereign overlays via .Values.externalSecretsWebhookGate.*.
#
# 1.4.3: deploy-only bump (a9861f94 — no semantic change beyond
# carrying the v0.13.2 image-tag fix to the bootstrap-kit).
#
# 1.4.2 (qa-loop bounded-cycle audit prov #7 Gap F, 2026-05-10): point
# `.Values.newapi.image.tag` at a tag that ACTUALLY EXISTS in GHCR. Pre-
# 1.4.2 the chart referenced `ghcr.io/openova-io/openova/newapi-mirror:
# v0.4.5` — a tag that was never built by any CI workflow and never
# pushed to GHCR (the package itself didn't exist; the original commit
# 44d0200a invented the tag with no matching CI build). Every fresh
# Sovereign's NewAPI Pod ImagePullBackOff'd, blocking alice signup
# gate 5 (LLM).
#
# Fix:
#   - NEW .github/workflows/build-bp-newapi.yaml — mirrors
#     `docker.io/calciumion/new-api:<UPSTREAM_VER>` to
#     `ghcr.io/openova-io/openova/newapi-mirror:<UPSTREAM_VER>` on every
#     push to platform/newapi/chart/**. Mirrors the bp-guacamole CI
#     pattern: capture upstream repo digest, re-tag into GHCR, bump
#     values.yaml + Chart.yaml + dispatch blueprint-release.
#   - values.yaml — bump `newapi.image.tag` from `v0.4.5` (fictitious)
#     to `v0.13.2` (latest stable Calcium-Ion/new-api on Docker Hub
#     at the authoring date, 2026-04-27 upstream publish). The
#     v1.0.0-rc.x line is gated on upstream stabilising the schema
#     migration; the channel-seed Job uses the legacy admin-API request
#     shape, so do NOT auto-roll past v0.13.x without re-running the
#     channel-seed integration smoke.
#   - appVersion bumped from `0.4.5` to `0.13.2` to match the mirrored
#     upstream (Helm convention: appVersion = upstream version without
#     the `v` prefix; consumers reading `helm list` see the actual
#     NewAPI release running).
#
# 1.4.1 (issue #952, 2026-05-05): templatize spec.imagePullSecrets on the
# Deployment + channel-seed Job and default `imagePullSecrets:
# [{name: ghcr-pull}]` in values.yaml. Pre-1.4.1 the Pod template emitted
# no imagePullSecrets, so kubelet pulled the PRIVATE
# `ghcr.io/openova-io/openova/newapi-mirror` and
# `ghcr.io/openova-io/openova/services-metering-sidecar` images
# anonymously and got 403 Forbidden on every fresh Sovereign — blocking
# alice signup gate 5 (LLM). The cloud-init `flux-system/ghcr-pull`
# Secret is reflected into the `newapi` namespace via bp-reflector
# annotations (paired with #952 cloud-init update adding `newapi` to the
# `reflection-auto-namespaces` list).
#
# 1.4.0 (issue #943, 2026-05-05): auto-provision CNPG-backed Postgres +
# in-chart credentials Secret so a Sovereign install at bootstrap-kit
# slot 80 lands a real Pod without operator intervention.
#
# Pre-#943 the deployment.yaml gate REQUIRED operator-supplied
# `database.existingSecret` AND `credentials.existingSecret` — the Pod
# silently skipped render whenever either was empty. On a freshly
# franchised Sovereign nothing populated those values, so NewAPI never
# came up and alice signup gate 5 (LLM) timed out waiting for the
# customer-facing /v1/* API to respond.
#
# Fix:
#   - NEW templates/cnpg-cluster.yaml — when .Values.cnpg.enabled
#     (DEFAULT true), renders postgresql.cnpg.io/v1.Cluster
#     `<release>-newapi-pg` + a Helm-`lookup`-persistent DSN Secret
#     `<release>-newapi-db-dsn` (key: SQL_DSN) read by the deployment.
#     Capabilities-gated on postgresql.cnpg.io/v1 so a cold install
#     before bp-cnpg is Ready surfaces as "no Cluster yet" rather
#     than a hard install error (mirrors platform/powerdns/chart/
#     templates/cnpg-cluster.yaml's pattern).
#   - NEW templates/credentials-secret.yaml — when
#     .Values.credentials.autoProvision (DEFAULT true) AND
#     .Values.credentials.existingSecret is empty, renders
#     `<release>-app-creds` carrying SESSION_SECRET + CRYPTO_SECRET
#     (each 64-char randAlphaNum, persistent across reconciles via
#     Helm `lookup`). helm.sh/resource-policy: keep so a re-install
#     recovers the same bytes (otherwise an upgrade silently rotates
#     the keys, invalidating every active session cookie + every
#     encrypted token).
#   - deployment.yaml — Secret name resolution now picks the chart-
#     emitted defaults when the operator hasn't supplied an override.
#     The Pod-render gate accepts EITHER an explicit existingSecret
#     OR the auto-provisioned one.
#   - values.yaml — adds .Values.cnpg.* and .Values.credentials.
#     {autoProvision,autoSecretName} blocks; every knob is operator-
#     overridable per Inviolable Principle #4.
#
# 1.3.0: defaultChannels.qwenBankDhofar (channel #1 = Qwen3.6 @
# https://llm-api.omtd.bankdhofar.com) + post-install/post-upgrade
# `channel-seed` Helm hook Job that idempotently POSTs default
# channels into NewAPI's admin API at /api/channel/. Bridges the
# previous documentation-only channels.yaml ConfigMap → the actual
# Postgres-backed channel rows the upstream binary reads at runtime.
# Issue #915 (epic SME tenant integration DoD: alice → OpenClaw →
# NewAPI → Qwen3.6@BankDhofar end-to-end).
# 1.2.0: Traefik Middleware gated behind ingress.middleware.enabled.
version: 1.4.4
appVersion: "0.13.2"
description: |
  Catalyst Blueprint scratch chart for NewAPI — multi-tenant LLM
  marketplace gateway.

  No first-party Helm chart is published by the NewAPI upstream
  (https://github.com/Calcium-Ion/new-api ships a docker-compose only).
  This chart hand-wires the upstream container image as a
  Deployment + Service + Ingress (split customer-API vs ops-admin) +
  ConfigMap + ServiceAccount + NetworkPolicy + ExternalSecret consumers.

  Connects to:
    - bp-cnpg (Postgres for users, credits, channels, audit log)
    - bp-valkey (session and rate-limit cache)
    - bp-keycloak (OIDC for ops-staff admin UI)
    - bp-external-secrets (DB DSN, master key, per-channel provider keys)
    - bp-vllm (in-cluster cheap-tier inference channel; optional)

  Customer surface is NOT this chart — Catalyst is the customer-facing
  UI. This chart's customer surface is the OpenAI-compatible API at
  `api.<host>/v1/*`. The upstream's portal UI is intentionally disabled
  via the ingress configuration.

  Pairs with bp-cnpg, bp-keycloak, bp-external-secrets, bp-valkey,
  bp-vllm (depends.list in blueprint.yaml).
type: application
keywords: [catalyst, blueprint, newapi, llm, gateway, marketplace, multi-tenant, credits, byok, openai-compatible, reseller]
maintainers:
  - name: OpenOva Catalyst
    email: catalyst@openova.io
# Scratch chart — see comments in bp-librechat/chart/Chart.yaml for the
# rationale on the `common` library subchart dependency (issue #181
# hollow-chart gate).
dependencies:
  - name: common
    version: "0.1.3"
    repository: "https://sigstore.github.io/helm-charts"