33 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
95a06f56f8
|
fix(sme-marketplace): unblock PIN signin — route /api/* to sme/gateway + add send-pin alias (#868) (#869)
Two-part fix for marketplace UI signin flow which 503'd then 404'd on
otech103. Live debugging found two stacked bugs.
Part A — chart (HTTPRoute backend):
- marketplace-routes.yaml: /api/* rule now backendRefs sme/gateway:8080
(cross-namespace) instead of catalyst-system/marketplace-api which had
a Service selector matching zero Pods. The gateway in sme already
fronts services-auth, catalog, tenant, billing, provisioning.
- marketplace-reference-grant.yaml: extend `to:` list with the gateway
Service so the cross-ns hop is authorised by Gateway API.
- Bump bp-catalyst-platform 1.4.7 → 1.4.8 + lockstep slot 13 pin.
Part B — services-auth (route name):
- Add /auth/send-pin alias delegating to existing SendMagicLink handler,
and /auth/verify-pin alias delegating to VerifyMagicLink. The
marketplace UI surfaces a 6-digit PIN ("Send PIN" button), so the
PIN-named routes are the canonical UX-facing names. /auth/magic-link
and /auth/verify remain registered for backward compat.
- services-build workflow auto-rebuilds the auth image on push to
core/services/** — no manual dispatch needed.
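The alias approach in Part B can be sketched with the standard library router. This is a minimal illustration, not the services-auth code: `sendMagicLink`, `verifyMagicLink`, `newAuthMux`, and `call` are hypothetical stand-ins, and only the four route paths come from the message above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sendMagicLink and verifyMagicLink are hypothetical stand-ins for the
// existing SendMagicLink / VerifyMagicLink handlers named in the commit.
func sendMagicLink(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "pin sent")
}

func verifyMagicLink(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "pin verified")
}

// newAuthMux registers the PIN-named routes and keeps the legacy names
// pointing at the same handlers for backward compatibility.
func newAuthMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/auth/send-pin", sendMagicLink)
	mux.HandleFunc("/auth/magic-link", sendMagicLink)
	mux.HandleFunc("/auth/verify-pin", verifyMagicLink)
	mux.HandleFunc("/auth/verify", verifyMagicLink)
	return mux
}

// call issues a POST against the mux and returns status code and body.
func call(mux *http.ServeMux, path string) (int, string) {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	mux := newAuthMux()
	for _, p := range []string{"/auth/send-pin", "/auth/magic-link", "/auth/verify-pin", "/auth/verify"} {
		code, body := call(mux, p)
		fmt.Println(p, code, body)
	}
}
```

Because both names share one handler, the alias cannot drift from the original behaviour.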
Refs: #868
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
|
||
|
|
fa4395fa3a
|
fix(bp-catalyst-platform): wire VALKEY_PASSWORD into SME auth + gateway (#863) (#864)
After PR #862 (1.4.4) made cross-ns Valkey reachable from `sme` ns, the auth Pod started CrashLoopBackOff with "NOAUTH HELLO must be called with the client already authenticated".
Root cause: bp-valkey 1.0.0 ships auth.enabled=true (bitnami default) but SME service code + Deployment templates never plumbed a password through.
Chart 1.4.4 -> 1.4.5. Slot 13 pin lockstep.
Changes:
- core/services/shared/db/valkey.go: add ConnectValkeyWithAuth overload taking username + password. ConnectValkey kept backwards-compatible for contabo-mkt's auth-less in-namespace Valkey.
- core/services/auth/main.go + gateway/main.go: read VALKEY_USERNAME + VALKEY_PASSWORD env, call ConnectValkeyWithAuth when password set, else fall through to no-auth path.
- NEW templates/sme-services/valkey-cross-ns-secret.yaml: Helm `lookup` reads bp-valkey's auto-generated `valkey-password` from the `valkey/valkey` Secret and re-emits it as `sme-valkey-auth` in `sme` ns. Same pattern as sme-secrets.yaml (#859) and gitea-admin-secret (#830 Bug 2). On first install the lookup may return nil; Flux's 15m reconcile picks up the mirror once bp-valkey is Ready.
- auth.yaml + gateway.yaml: add VALKEY_PASSWORD env from `sme-valkey-auth` Secret with optional=true so contabo-mkt's auth-less path keeps working when the mirror Secret is absent.
- values.yaml: add `smeServices.valkey.{sourceSecretName, sourcePasswordKey, destNamespace, destSecretName}` knobs (Inviolable Principle #4).
Live verified the failure mode on otech103: 11/13 SME pods Running 1/1, auth in CrashLoopBackOff with NOAUTH HELLO error. Provisioning Pod's CreateContainerConfigError is unrelated (ghcr-pull, separate ticket).
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
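The "password set, else no-auth" branch described for auth/main.go and gateway/main.go could look roughly like the sketch below. The real ConnectValkey / ConnectValkeyWithAuth signatures are not shown in the message, so `valkeyOptions`, `optionsFromEnv`, and the `VALKEY_ADDR` variable are assumptions made for illustration; only VALKEY_USERNAME and VALKEY_PASSWORD come from the commit.

```go
package main

import (
	"fmt"
	"os"
)

// valkeyOptions is a hypothetical holder for connection settings.
type valkeyOptions struct {
	Addr     string
	Username string
	Password string
	UseAuth  bool
}

// optionsFromEnv selects the authenticated path only when a password is
// present, so an auth-less in-namespace Valkey keeps working unchanged.
func optionsFromEnv(getenv func(string) string) valkeyOptions {
	opts := valkeyOptions{Addr: getenv("VALKEY_ADDR")} // VALKEY_ADDR is an assumed name
	if pw := getenv("VALKEY_PASSWORD"); pw != "" {
		opts.Username = getenv("VALKEY_USERNAME")
		opts.Password = pw
		opts.UseAuth = true
	}
	return opts
}

func main() {
	opts := optionsFromEnv(os.Getenv)
	// Never print the password itself; only report which path was chosen.
	fmt.Printf("addr=%q auth=%v\n", opts.Addr, opts.UseAuth)
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the branch testable without mutating the process environment.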
||
|
|
5cdb738ac9
|
fix(services): go mod tidy across sibling services after #798 shared deps bump (#821)
#798 added github.com/nats-io/nats.go to core/services/shared/go.mod and adjusted x/sys, x/crypto, and x/text to Go 1.22-compatible versions. The sibling services (auth, catalog, domain, gateway, notification, provisioning, tenant) reference the same shared module via the local `replace` directive, so their go.sum files must include the new transitive hashes; otherwise the CI Containerfile build hits:
go: updates to go.mod needed; to update it:
go mod tidy
This commit is a pure `go mod tidy` across all 7 services; no source changes. CI services-build is now unblocked.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
9645a9044a
|
feat(metering): NewAPI NATS publisher + sme-billing subscriber + POST /metering/record (#798) (#818)
* feat(metering): NewAPI NATS publisher + sme-billing subscriber + POST /metering/record (#798)
Per #795 [Q-mine-3] (NATS not RedPanda) + [Q-mine-4] (one ledger), add the SME-2 metering integration end-to-end.
NewAPI is consumed as the upstream image `ghcr.io/openova-io/openova/newapi-mirror` (a pinned mirror, not a fork); the metering envelope is produced by a Go sidecar that observes the OpenAI-style `usage.total_tokens` field on every 2xx /v1/* response. This avoids forking the upstream binary while still producing the canonical envelope shape on `catalyst.usage.recorded`.
A) NewAPI metering sidecar: core/services/metering-sidecar/
- Transparent reverse proxy in front of NewAPI on its own port; the bp-newapi Service routes the cluster-fronting port to the sidecar, which forwards to NewAPI on the pod's loopback.
- Observes successful /v1/* JSON responses, parses `usage.{prompt_tokens,completion_tokens,total_tokens}`, computes amount_micro_omr = -tokens * priceMicroOMRPerToken, and publishes one envelope on `catalyst.usage.recorded` per completed request.
- Failed (non-2xx), non-JSON, and admin-path requests are NOT billed.
- Customer-facing latency is NEVER blocked on metering: the response body is restored before publish; on NATS unreachable the envelope is persisted to disk and retried by a background drain loop.
- 14 unit tests (proxy + publisher + safeFilename guards).
B) sme-billing NATS subscriber: core/services/billing/handlers/metering_consumer.go
- JetStream durable consumer `sme-billing-metering` on stream `CATALYST_USAGE` (provisioned by sme-billing on startup).
- Idempotent on metadata.request_id via a UNIQUE partial index on credit_ledger.external_ref; redelivery from the broker collapses to a single ledger row.
- Customer auto-create on cold start (the rbac sme.user.created envelope may land AFTER the first metered request; we don't strand usage waiting for it).
- 11 unit tests covering happy-path, idempotency, malformed-payload poison-pill, missing-request-id, non-negative amount guard, resolver error → Nak, derive-micro-OMR-from-OMR, DB-error → Nak.
C) HTTP handler POST /billing/metering/record: handlers/metering.go
- Synchronous validate → INSERT credit_ledger → return {ledger_entry_id, balance_after_omr, balance_after_micro_omr, duplicate}. Same payload + idempotency guard as the NATS path.
- Auth: superadmin OR sovereign-admin (operator-admin model; end-user LLM traffic flows through the sidecar, never this URL).
- 8 unit tests covering happy-path, idempotency, role gating, malformed-JSON, positive-amount rejection, customer-not-found.
D) Schema: core/services/billing/store/store.go
- ALTER TABLE credit_ledger ADD COLUMN amount_micro_omr BIGINT (1 OMR = 1,000,000 micro-OMR; -0.000234 OMR = -234 micro-OMR exact integer; preserves precision at metering rates).
- ADD COLUMN external_ref TEXT + UNIQUE partial index for idempotency dedup.
- ADD COLUMN metadata JSONB for the raw envelope.
- GetCreditBalance projects both amount_omr (legacy) and amount_micro_omr (new) into the integer-OMR view.
- GetCreditBalanceMicroOMR returns canonical precision.
- RecordUsage method: ON CONFLICT DO UPDATE … RETURNING (xmax<>0) distinguishes fresh insert from duplicate without a follow-up SELECT.
E) Wiring
- core/services/shared/events/nats.go: minimal NATS JetStream publisher + subscriber surface; legacy RedPanda producer/consumer in events.go untouched per [Q-mine-3].
- core/services/billing/main.go: NATS_URL env; subscriber wired in parallel with the existing RedPanda tenant-events consumer.
- middleware/jwt.go: exported test helper WithClaims so handler tests can construct an authenticated context without minting a real signed token.
- .github/workflows/services-build.yaml: metering-sidecar added to the build matrix; deploy job skips it (image consumed by the bp-newapi chart, not products/catalyst sme-services).
F) bp-newapi chart (1.0.0 → 1.1.0)
- meteringSidecar block in values.yaml: image, port, NATS URL, priceMicroOMRPerToken (default 156 = 0.000156 OMR/token), spool dir, header names, resources, securityContext (read-only-rootfs).
- deployment.yaml renders the sidecar container + emptyDir spool volume when meteringSidecar.enabled (default true).
- service.yaml routes the cluster-fronting :3000 to the sidecar when enabled, exposes a separate :3001 → NewAPI direct port for bp-catalyst-platform admin-API traffic (ADR-0003 §3.2).
- networkpolicy.yaml allows the sidecar's port + nats-system egress for JetStream publish.
Tests: 33 new (14 sidecar + 11 subscriber + 8 HTTP handler), all green. Helm template renders cleanly with sidecar enabled and disabled.
Closes #798
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(billing/store): cast SUM to BIGINT so lib/pq scans into int64 (#798)
Postgres returns `SUM(int) + SUM(bigint)/integer` as `numeric`, which lib/pq presents as a `[]uint8` decimal string ("50.000000000000000000000000") that does NOT scan directly into Go int64. The integration test TestVoucherLifecycle_IssueRedeemAndCreditApplied caught this in CI on the post-redeem balance read.
Wrap the SUM expressions in CAST(... AS BIGINT) so the column type is unambiguously bigint and Scan target stays uniform across pre-#798 rows (amount_omr only) and post-#798 rows (amount_micro_omr present).
Affects:
- GetCreditBalance
- GetCreditBalanceMicroOMR
- RecordUsage's running-balance read
Test mocks updated to match the new SQL prefix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
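The sidecar's pricing formula (amount_micro_omr = -tokens * priceMicroOMRPerToken) and the 1 OMR = 1,000,000 micro-OMR scale can be checked with integer arithmetic alone. The sketch below is an illustration under stated assumptions: the function names `usageAmountMicroOMR` and `formatOMR` are hypothetical, and the 1500-token request is an invented input; the price of 156 is the chart default quoted above.

```go
package main

import "fmt"

const microPerOMR = 1_000_000 // 1 OMR = 1,000,000 micro-OMR

// usageAmountMicroOMR returns the negative ledger amount for one request.
func usageAmountMicroOMR(totalTokens, priceMicroOMRPerToken int64) int64 {
	return -totalTokens * priceMicroOMRPerToken
}

// formatOMR renders a micro-OMR amount as a decimal OMR string without
// going through floating point, so no precision is lost.
func formatOMR(micro int64) string {
	sign := ""
	if micro < 0 {
		sign = "-"
		micro = -micro
	}
	return fmt.Sprintf("%s%d.%06d", sign, micro/microPerOMR, micro%microPerOMR)
}

func main() {
	amount := usageAmountMicroOMR(1500, 156) // hypothetical 1500-token request
	fmt.Println(amount, formatOMR(amount))   // -234000 -0.234000
	fmt.Println(formatOMR(-234))             // -0.000234
}
```

Keeping the ledger in integer micro-OMR is what lets very small per-request charges accumulate exactly.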
||
|
|
2a034a0959
|
feat(catalog): unified catalog with Published flag — operator curates marketplace (#710 wave 2) (#724)
Single source of truth for apps; Sovereign-console operator decides which
apps marketplace customers see; marketplace storefront filters by
Published. Per founder rule 2026-05-04: unpublish is a marketplace-
visibility toggle, not a deployment-lifecycle action — existing tenant
deployments of an unpublished app keep running unaffected.
core/services/catalog/store/store.go
====================================
- App.Published bool — operator-controlled visibility
- ListPublishedApps: marketplace-storefront subset
(Published=true AND System=false AND Deployable=true).
System and Deployable are catalog-team-controlled; Published is the
operator's curation knob.
- SetAppPublished(slug, bool) — hot-path one-bit write the Sovereign
console hits per row toggle. Cheaper than UpdateApp; slug-keyed so
the UI doesn't need the internal Mongo _id.
- UpdateApp: thread published through full-update path too.
core/services/catalog/handlers/handlers.go + routes.go
======================================================
- ListApps now honours ?published=true query param:
GET /catalog/apps → operator view: every app
GET /catalog/apps?published=true → marketplace view: filtered
- New PATCH /catalog/admin/apps/{slug}/publish?value={true|false}
for the Sovereign-console operator's row toggle.
- requireAdmin gating preserved on the admin endpoint.
core/services/catalog/handlers/seed.go
======================================
- migrateAppPublished: defaults Published=true on every existing app
on the day Catalyst 1.3.x ships. Operators opt OUT of marketplace
visibility per app, not IN — matches how a real SaaS storefront is
curated and prevents an empty marketplace on flag-introduction day.
Idempotent on re-run.
core/marketplace/src/lib/api.ts
================================
- getApps() now hits /catalog/apps?published=true so the marketplace
storefront only renders the operator-curated subset.
DoD pending wave 2.5
====================
The Sovereign-console "Catalog & publishing" admin page (per-row
toggle UI) is the next chunk and ships in a follow-up — backend +
storefront filter are the load-bearing change here. Catalog admins
can flip the flag today via the PATCH endpoint; the per-row UI is
quality-of-life on top.
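The storefront predicate (Published=true AND System=false AND Deployable=true) is small enough to show in isolation. This is an in-memory sketch only: the real ListPublishedApps queries the catalog store, and the `App` struct here carries just the three flags plus a slug, with `listPublishedApps` as a hypothetical name.

```go
package main

import "fmt"

// App carries only the flags the commit describes; other fields are omitted.
type App struct {
	Slug       string
	Published  bool // operator's curation knob
	System     bool // catalog-team-controlled
	Deployable bool // catalog-team-controlled
}

// listPublishedApps returns the marketplace-storefront subset.
func listPublishedApps(apps []App) []App {
	var out []App
	for _, a := range apps {
		if a.Published && !a.System && a.Deployable {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	apps := []App{
		{Slug: "wordpress", Published: true, Deployable: true},
		{Slug: "hidden-app", Published: false, Deployable: true},
		{Slug: "platform-core", Published: true, System: true, Deployable: true},
	}
	for _, a := range listPublishedApps(apps) {
		fmt.Println(a.Slug)
	}
}
```

Unpublishing flips only the first flag, which is why existing tenant deployments are unaffected.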
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
|
||
|
|
73d68d99c1
|
fix(auth-ux): HTML PIN email + copyable email pill + 6-box marketplace PIN + drop UI debris (#721) (#723)
Wave 1 of #721: what the founder actually saw on console.openova.io and marketplace.openova.io / marketplace.<sov>.
PIN email rewrite (catalyst-api auth.go)
========================================
Was: plaintext "Your OpenOva sign-in code:\n\n 9 6 5 1 2 8\n…"
Now: multipart/alternative MIME with a polished HTML alternative: white card on neutral background, OpenOva mark + wordmark, "Your sign-in code" heading, big tinted code block (34px monospaced, 10px letter-spacing, one-tap copy on iOS Mail), expiration + ignore notice, footer credit. Inline styles only, because Gmail/Outlook web strip <style>. Card pinned at 480px so narrow webmail panes render correctly. text/plain fallback kept for clients without HTML.
Catalyst-Zero verify page (VerifyPinPage.tsx)
=============================================
- Email shown as a copyable PILL with copy icon: click copies to clipboard, icon flips to a check for 1.5s. Selection-fallback for browsers without clipboard API.
- Centered title + subtitle (was left-aligned in 1.2.x).
- Microcopy: "Codes expire after 10 minutes — check your spam folder."
Marketplace checkout sign-in (CheckoutStep.svelte)
==================================================
- 1 single <input maxlength=6> → 6 separate <input maxlength=1> boxes with auto-advance, paste-fan-out (paste a 6-digit code anywhere on the row, all 6 boxes fill, autosubmits), backspace-back, ArrowLeft/Right navigation, autocomplete=one-time-code on first box for iOS SMS autofill, caret-transparent so the digit IS the caret.
- Email shown as the same copyable pill pattern (svg copy/check icons, hover-to-brand affordance).
- Dropped "Use a different email" link (browser back works).
- Added expire/spam microcopy below button.
Header + wayfinding cleanup
===========================
- Header.svelte: top-right "Sign in" button hidden when pathname is /checkout or /login. Two sign-in CTAs on the same screen was the UI debris caught live 2026-05-04.
- CheckoutStep.svelte: "← Back to Review" moved from bottom-left (where users don't look) to top-left above the Checkout heading, rendered with a chevron icon.
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
|
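The multipart/alternative structure of the rewritten PIN email can be sketched with Go's standard `mime/multipart` package. This is not the catalyst-api auth.go code: `buildPINBody` and `partTypes` are hypothetical helpers and the HTML is a heavily reduced placeholder. The part order (text/plain first, text/html last) follows the MIME convention that the preferred alternative comes last.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime"
	"mime/multipart"
	"net/textproto"
)

// buildPINBody writes a two-part body and returns it together with the
// Content-Type header value the enclosing message must carry.
func buildPINBody(pin string) ([]byte, string, error) {
	var buf bytes.Buffer
	w := multipart.NewWriter(&buf)
	parts := []struct{ ctype, content string }{
		{"text/plain; charset=utf-8", "Your sign-in code: " + pin + "\r\n"},
		{"text/html; charset=utf-8", `<p style="font-family:monospace;font-size:34px;letter-spacing:10px">` + pin + `</p>`},
	}
	for _, p := range parts {
		pw, err := w.CreatePart(textproto.MIMEHeader{"Content-Type": {p.ctype}})
		if err != nil {
			return nil, "", err
		}
		if _, err := io.WriteString(pw, p.content); err != nil {
			return nil, "", err
		}
	}
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return buf.Bytes(), "multipart/alternative; boundary=" + w.Boundary(), nil
}

// partTypes parses the body back and lists each part's Content-Type.
func partTypes(body []byte, contentType string) ([]string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return nil, err
	}
	r := multipart.NewReader(bytes.NewReader(body), params["boundary"])
	var types []string
	for {
		p, err := r.NextPart()
		if err == io.EOF {
			return types, nil
		}
		if err != nil {
			return nil, err
		}
		types = append(types, p.Header.Get("Content-Type"))
	}
}

func main() {
	body, ct, err := buildPINBody("965128")
	if err != nil {
		panic(err)
	}
	types, err := partTypes(body, ct)
	if err != nil {
		panic(err)
	}
	fmt.Println(ct)
	fmt.Println(types)
}
```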
||
|
|
4946ccd125
|
feat(bp-catalyst-platform): expose marketplace + tenant wildcard, bump 1.3.0 (closes #710) (#719)
Marketplace exposure for franchised Sovereigns. Otech becomes a SaaS
operator with a single overlay toggle.
Changes
=======
products/catalyst/chart:
- Chart.yaml 1.2.7 → 1.3.0
- values.yaml: ingress.marketplace.enabled toggle (default false) +
marketplace.{brand,currency,paymentProvider,signupPolicy} surface
- templates/sme-services/marketplace-routes.yaml: HTTPRoute
marketplace.<sov> with /api/ → marketplace-api, /back-office/ → admin,
/ → marketplace; HTTPRoute *.<sov> → console (per-tenant wildcard)
- templates/sme-services/marketplace-reference-grant.yaml: cross-
namespace ReferenceGrant from catalyst-system HTTPRoute → sme Services
- .helmignore: stop excluding sme-services/* and marketplace-api/* (only
*.kustomization.yaml + *.ingress.yaml remain Kustomize-only)
- All sme-services/* + marketplace-api/* manifests wrapped with
{{ if .Values.ingress.marketplace.enabled }} so non-marketplace
Sovereigns render the chart unchanged
clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml:
- chart version 1.2.7 → 1.3.0
- ingress.hosts.marketplace.host: marketplace.${SOVEREIGN_FQDN}
- ingress.marketplace.enabled: ${MARKETPLACE_ENABLED:-false}
infra/hetzner:
- variables.tf: marketplace_enabled var (string "true"/"false", default "false")
- main.tf: thread var into cloudinit-control-plane.tftpl
- cloudinit-control-plane.tftpl: postBuild.substitute.MARKETPLACE_ENABLED
on bootstrap-kit, sovereign-tls, infrastructure-config Kustomizations
products/catalyst/bootstrap/api/internal/provisioner/provisioner.go:
- Request.MarketplaceEnabled bool (json:"marketplaceEnabled")
- writeTfvars: marketplace_enabled = "true"|"false"
core/pool-domain-manager/internal/allocator/allocator.go:
- canonicalRecordSet adds "marketplace" prefix → marketplace.<sov>
resolves via PDM at zone-commit time (PR #710 explicit record so
caches don't depend on the *.<sov> wildcard alone)
DoD ready
=========
- helm template with ingress.marketplace.enabled=false → identical
manifest set to 1.2.7 (verified locally)
- helm template with ingress.marketplace.enabled=true → emits 17 extra
resources: 13 sme-services workloads + 2 marketplace-api + 1
HTTPRoute pair + 1 ReferenceGrant
- pdm tests: TestCanonicalRecordSet, TestCommitDNSShape green
- catalyst-api builds, provisioner cloudinit_path_test green
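How the boolean on the provisioner request becomes the string-typed Terraform variable can be sketched as follows. Only the `MarketplaceEnabled` field, its JSON tag, and the `marketplace_enabled = "true"|"false"` output shape come from the message above; `marketplaceTfvarLine` and `tfvarFromJSON` are hypothetical helpers, and the real writeTfvars emits many more variables.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// Request shows only the field named in the commit; the real struct has more.
type Request struct {
	MarketplaceEnabled bool `json:"marketplaceEnabled"`
}

// marketplaceTfvarLine renders the flag as a quoted string, because the
// Terraform variable is declared as a string "true"/"false".
func marketplaceTfvarLine(req Request) string {
	return fmt.Sprintf("marketplace_enabled = %q", strconv.FormatBool(req.MarketplaceEnabled))
}

// tfvarFromJSON decodes a request body and renders the tfvars line.
func tfvarFromJSON(raw string) (string, error) {
	var req Request
	if err := json.Unmarshal([]byte(raw), &req); err != nil {
		return "", err
	}
	return marketplaceTfvarLine(req), nil
}

func main() {
	line, err := tfvarFromJSON(`{"marketplaceEnabled": true}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(line) // marketplace_enabled = "true"
}
```

An omitted field decodes to false, which matches the chart default of leaving the marketplace disabled.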
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
|
||
|
|
174ca02aba
|
feat(marketplace): omantel.openova.io vanity host with light-theme partner branding (#633)
Adds a tenant-aware branding layer to the marketplace so the same pods can serve marketplace.openova.io (default OpenOva, dark) and omantel.openova.io (Omantel logo, forced light theme), with no extra deployments and no extra resources. Tomorrow's Omantel demo lands on omantel.openova.io and gets the partner look without disturbing the existing marketplace.openova.io experience.
Changes
- src/lib/tenant.ts: hostname → tenant config (logo, brand, force theme, skip-console-redirect). Easy to extend with future partner hosts.
- src/layouts/Layout.astro: pre-hydration script sets <html data-tenant> and forces light theme for omantel before paint (zero flash). Returning-user redirect to console.openova.io/nova is suppressed for tenants with skipConsoleRedirect=true so the demo stays on the partner host.
- src/components/Header.svelte: renders both brand spans; CSS in global.css hides the inactive one based on html[data-tenant]. SSR'd HTML stays cacheable across hostnames.
- public/logos/omantel.svg: official Omantel wordmark (Wikimedia source, brand colours #283d90 navy + #e27739 orange).
Ingress + chart fixes
- products/catalyst/chart/templates/sme-services/ingress.yaml: adds two ingresses (omantel /api/ priority 200, omantel / priority 100) pointing at the existing gateway/marketplace services. cert-manager issues omantel-tls via letsencrypt-prod (DNS already resolves via the *.openova.io wildcard A record).
- products/catalyst/chart/templates/sme-services/marketplace.yaml: this path is Kustomize-applied (contabo-mkt only; Sovereigns skip via .helmignore), so the image must be a concrete string. PR #580 templated it with Helm syntax which produced InvalidImageName on the new ReplicaSet, so rolling forward stalled. De-templatized and pinned to the current deployed SHA so the marketplace-build CI sed can update it.
Backwards compatibility
- marketplace.openova.io: identical render (default tenant 'openova', inline OpenOva SVG, dark theme by default, console redirect intact).
- Other hosts (console.openova.io, admin.openova.io): untouched.
Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
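The hostname → tenant lookup lives in src/lib/tenant.ts (TypeScript); the sketch below restates the idea in Go purely for illustration and is not a port of that file. The `Tenant` fields mirror the config items listed in the message, while `tenantsByHost`, `defaultTenant`, and `tenantForHost` are hypothetical names.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// Tenant mirrors the per-host config the commit lists.
type Tenant struct {
	Name                string
	ForceLightTheme     bool
	SkipConsoleRedirect bool
}

// tenantsByHost holds partner hosts; adding a partner is one more entry.
var tenantsByHost = map[string]Tenant{
	"omantel.openova.io": {Name: "omantel", ForceLightTheme: true, SkipConsoleRedirect: true},
}

// defaultTenant is what every unlisted host (e.g. marketplace.openova.io) gets.
var defaultTenant = Tenant{Name: "openova"}

// tenantForHost normalises case, drops an optional port, and falls back
// to the default tenant for unknown hosts.
func tenantForHost(host string) Tenant {
	h := strings.ToLower(host)
	if stripped, _, err := net.SplitHostPort(h); err == nil {
		h = stripped
	}
	if t, ok := tenantsByHost[h]; ok {
		return t
	}
	return defaultTenant
}

func main() {
	for _, h := range []string{"omantel.openova.io", "marketplace.openova.io", "Omantel.OpenOva.io:443"} {
		fmt.Printf("%s -> %+v\n", h, tenantForHost(h))
	}
}
```

Falling back to a default rather than erroring is what keeps the existing marketplace host rendering identically.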
||
|
|
5a403e66b1
|
fix(tls): DNS-01 wildcard TLS chain — solverName pdns, NodePort 30053, dynadot test fix (#582)
* fix(bp-harbor): CNPG database must be 'registry' not 'harbor' — matches coreDatabase
Harbor upstream always connects to a database named 'registry'
(harbor.database.external.coreDatabase default). The CNPG Cluster was
initialised with database='harbor', causing:
FATAL: database "registry" does not exist (SQLSTATE 3D000)
Fix: change postgres.cluster.database default from 'harbor' → 'registry'
in values.yaml and cnpg-cluster.yaml template. Both the CNPG bootstrap
and Harbor's coreDatabase now use 'registry'.
Runtime fix on otech22: CREATE DATABASE registry OWNER harbor was run
against harbor-pg-1. harbor-core is now 1/1 Running.
Bump bp-harbor 1.2.1 → 1.2.2. Bootstrap-kit refs updated.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(tls): DNS-01 wildcard TLS chain — solverName, NodePort 30053, dynadot test fix
Five independent fixes that together complete the DNS-01 wildcard TLS chain
for per-Sovereign certificate autonomy:
1. cert-manager-powerdns-webhook solverName mismatch (root cause of #550 echo):
- values.yaml: `webhook.solverName: powerdns` → `pdns`
- The zachomedia binary's Name() returns "pdns" (hardcoded). cert-manager
calls POST /apis/<groupName>/v1alpha1/<solverName>; when solverName is
"powerdns" cert-manager gets 404 → "server could not find the resource".
2. cert-manager-dynadot-webhook solver_test.go mock format:
- writeOK() and error injection used old ResponseHeader-wrapped format
- Real api3.json returns ResponseCode/Status directly in SetDnsResponse
- This caused the image build to fail at
|
||
|
|
7c3ff940ff |
fix(ci): update solver_test.go fixtures + expected-bootstrap-deps.yaml for #550
- core/cmd/cert-manager-dynadot-webhook/solver_test.go: fix SetDns2Response → SetDnsResponse and ResponseCode:"0" → ResponseCode:0 in test fixtures so webhook command tests pass against the corrected dynadot-client JSON parsing
- scripts/expected-bootstrap-deps.yaml: declare bp-cert-manager-dynadot-webhook at slot 49b so the bootstrap-kit dependency-graph audit passes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
||
|
|
ccc38987c2
|
fix(tls): bp-cert-manager-dynadot-webhook slot 49b + DNS-01 JSON bug (Closes #550) (#558)
Root cause: bootstrap-kit installs bp-cert-manager-powerdns-webhook (slot 49)
but the letsencrypt-dns01-prod ClusterIssuer wires to the dynadot webhook
(groupName: acme.dynadot.openova.io). Without slot 49b the APIService for
acme.dynadot.openova.io does not exist → cert-manager gets "forbidden" on
every ChallengeRequest → sovereign-wildcard-tls stays in Issuing indefinitely
→ HTTPS gateway has no cert → SSL_ERROR_SYSCALL on the handover URL.
Changes:
- core/pkg/dynadot-client: fix SetDnsResponse JSON key (was SetDns2Response,
API returns SetDnsResponse); change ResponseCode to json.Number (API returns
integer 0, not string "0"); update tests to match real API response format
- platform/cert-manager-dynadot-webhook/chart:
- rbac.yaml: add domain-solver ClusterRole + ClusterRoleBinding so
cert-manager SA can CREATE on acme.dynadot.openova.io (the "forbidden" fix)
- values.yaml: add certManager.{namespace,serviceAccountName}, clusterIssuer.*
and privateKeySecretRefName; add rbac.create comment for domain-solver
- certificate.yaml: trunc 64 on commonName (was 76 bytes, cert-manager rejects >64)
- clusterissuer.yaml: new template (skip-render default, enabled via overlay)
- deployment.yaml: add imagePullSecrets support (required for private GHCR)
- Chart.yaml: bump to 1.1.0
- clusters/_template/bootstrap-kit:
- 49b-bp-cert-manager-dynadot-webhook.yaml: new slot (PRE-handover issuer)
- kustomization.yaml: add 49b entry
- infra/hetzner:
- variables.tf: add dynadot_managed_domains variable
- main.tf: pass dynadot_{key,secret,managed_domains} to cloud-init template
- cloudinit-control-plane.tftpl: write cert-manager/dynadot-api-credentials
Secret + apply it before Flux reconciles bootstrap-kit
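The JSON fix in core/pkg/dynadot-client (correct envelope key, numeric ResponseCode) can be illustrated with `encoding/json` and `json.Number`. This is a sketch, not the client's actual types: `setDNSEnvelope` and `parseSetDNS` are hypothetical, and the `"success"` status value in the sample payload is an assumption; the key name and the integer 0 come from the message above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// setDNSEnvelope models only the two fields the commit mentions.
type setDNSEnvelope struct {
	SetDnsResponse struct {
		ResponseCode json.Number `json:"ResponseCode"`
		Status       string      `json:"Status"`
	} `json:"SetDnsResponse"`
}

// parseSetDNS decodes the envelope and converts the code to an integer.
// A payload under the wrong key leaves ResponseCode empty, so the
// conversion fails instead of silently reporting code 0.
func parseSetDNS(raw []byte) (int64, string, error) {
	var env setDNSEnvelope
	if err := json.Unmarshal(raw, &env); err != nil {
		return 0, "", err
	}
	code, err := env.SetDnsResponse.ResponseCode.Int64()
	if err != nil {
		return 0, "", fmt.Errorf("missing or invalid ResponseCode: %w", err)
	}
	return code, env.SetDnsResponse.Status, nil
}

func main() {
	good := []byte(`{"SetDnsResponse":{"ResponseCode":0,"Status":"success"}}`)
	code, status, err := parseSetDNS(good)
	fmt.Println(code, status, err)

	oldKey := []byte(`{"SetDns2Response":{"ResponseCode":0,"Status":"success"}}`)
	_, _, err = parseSetDNS(oldKey)
	fmt.Println(err != nil)
}
```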
Co-authored-by: hatiyildiz <hatiyildiz@openova.io>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
|
||
|
|
3a34969a2f
|
feat(catalyst+pdm): Sovereign self-decommission + post-handover redirect (closes #319) (#451)
Customer-side decommission UI + PDM release endpoints + Catalyst-Zero
redirect to console.<sovereign-fqdn> once handover is finalised.
Anti-duplication map (canonical seams reused, NOT duplicated):
- catalyst-api wipe.go: existing wipe endpoint already drives PDM
release + Hetzner purge + tofu destroy + local cleanup. The new
DecommissionPage POSTs to the same endpoint with an optional
backup-destination payload.
- PDM Allocator.Release: child zone delete + parent-zone NS revert
+ allocation row delete already idempotent. The new sovereign-side
POST /api/v1/release is a thin FQDN-shaped wrapper that splits at
the first dot and delegates to Allocator.Release.
- The orphan force-release path adds gates (X-Force-Release-Confirm
header, 30-day grace, DNS-NXDOMAIN check) on top of the same seam.
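The "splits at the first dot" step of the FQDN-shaped release wrapper can be sketched with `strings.Cut`. `splitFQDN` is a hypothetical helper, and `acme.example.com` is a placeholder name; the real handler then delegates to Allocator.Release, which is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFQDN cuts a Sovereign FQDN at the first dot into the subdomain
// label and the pool domain. A trailing root dot is tolerated.
func splitFQDN(fqdn string) (string, string, error) {
	name := strings.TrimSuffix(strings.TrimSpace(fqdn), ".")
	sub, pool, ok := strings.Cut(name, ".")
	if !ok || sub == "" || pool == "" {
		return "", "", fmt.Errorf("%q is not of the form <subdomain>.<poolDomain>", fqdn)
	}
	return sub, pool, nil
}

func main() {
	sub, pool, err := splitFQDN("acme.example.com")
	fmt.Println(sub, pool, err) // acme example.com <nil>

	_, _, err = splitFQDN("localhost")
	fmt.Println(err)
}
```

Cutting at the first dot, rather than the last, keeps multi-label pool domains intact.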
Scope contract with #317 (handover finalisation): NOT touching
internal/handler/handover.go. AdoptedAt is a new contract field on
Deployment + store.Record that the redirect helper consumes; future
#317 enhancement will populate it before deletion.
Files:
core/pool-domain-manager/internal/handler/release.go (NEW)
core/pool-domain-manager/internal/handler/release_test.go (NEW)
core/pool-domain-manager/internal/handler/handler.go (route wiring)
products/catalyst/bootstrap/api/internal/handler/deployments.go (AdoptedAt field + State()/toRecord/fromRecord)
products/catalyst/bootstrap/api/internal/handler/deployments_adopted_test.go (NEW)
products/catalyst/bootstrap/api/internal/store/store.go (AdoptedAt persistence)
products/catalyst/bootstrap/ui/src/pages/sovereign/DecommissionPage.tsx (NEW)
products/catalyst/bootstrap/ui/src/pages/sovereign/DecommissionPage.test.tsx (NEW)
products/catalyst/bootstrap/ui/src/pages/sovereign/Dashboard.tsx (Decommission link)
products/catalyst/bootstrap/ui/src/app/router.tsx (redirect + decom route)
docs/omantel-handover-wbs.md (T319 → done)
Tests: 13 new Go test cases + 5 new vitest cases all green. catalyst-
api + PDM full suites pass. Live execution against omantel deferred to
Phase 8 per ticket scope (no Dynadot/Hetzner exec here).
Co-authored-by: hatiyildiz <hatiyildiz@noreply.github.com>
|
||
|
|
5502d9aa48
|
feat(dns): cert-manager-dynadot-webhook for DNS-01 wildcard TLS (closes #159) (#291)
Activates the previously-templated `letsencrypt-dns01-prod` ClusterIssuer
in bp-cert-manager by shipping the missing piece — a Go binary that
satisfies cert-manager's external webhook contract
(`webhook.acme.cert-manager.io/v1alpha1`) against the Dynadot api3.json.
Architecture
============
* `core/pkg/dynadot-client/` — canonical Dynadot HTTP client (shared with
pool-domain-manager and catalyst-dns). Encapsulates the api3.json
transport, command builders, response decoding, and the safe
read-modify-write semantics required to never accidentally wipe a
zone (memory: feedback_dynadot_dns.md). Destructive `set_dns2`
variant is unexported.
* `core/cmd/cert-manager-dynadot-webhook/` — the cert-manager webhook
binary. Implements `Solver.Present` via the client's append-only
`AddRecord` path and `Solver.CleanUp` via the read-modify-write
`RemoveSubRecord` path. Domain allowlist (`DYNADOT_MANAGED_DOMAINS`)
rejects challenges for unmanaged apexes BEFORE any Dynadot call.
* `platform/cert-manager-dynadot-webhook/` — Catalyst-authored Helm
wrapper. Templates Deployment + Service + APIService + serving
Certificate (CA chain via cert-manager Issuer self-signing) +
RBAC + ServiceAccount. Mirrors the standard cert-manager external-
webhook deployment shape.
* `platform/cert-manager/chart/` — flips `dns01.enabled: true` so the
paired ClusterIssuer activates. The interim http01 issuer remains
templated as the rollback path.
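The domain allowlist check that runs before any Dynadot call can be sketched as a suffix match on label boundaries. The message does not state the format of `DYNADOT_MANAGED_DOMAINS`, so the comma-separated parsing below is an assumption, and `parseManagedDomains` / `isManaged` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// parseManagedDomains assumes a comma-separated list (an assumption; the
// real format is not stated in the commit message).
func parseManagedDomains(raw string) []string {
	var out []string
	for _, d := range strings.Split(raw, ",") {
		d = strings.ToLower(strings.Trim(strings.TrimSpace(d), "."))
		if d != "" {
			out = append(out, d)
		}
	}
	return out
}

// isManaged reports whether fqdn equals a managed apex or sits beneath
// one. Matching on "."+apex avoids accepting look-alike names.
func isManaged(fqdn string, managed []string) bool {
	name := strings.ToLower(strings.TrimSuffix(fqdn, "."))
	for _, d := range managed {
		if name == d || strings.HasSuffix(name, "."+d) {
			return true
		}
	}
	return false
}

func main() {
	managed := parseManagedDomains("example.com, example.org")
	fmt.Println(isManaged("_acme-challenge.acme.example.com.", managed)) // true
	fmt.Println(isManaged("evil-example.com", managed))                  // false
}
```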
Test results
============
core/pkg/dynadot-client — 7 tests PASS (race-clean)
core/cmd/cert-manager-dynadot-... — 9 tests PASS (race-clean)
Test coverage includes a Present/CleanUp round-trip against an
httptest fixture that models Dynadot's zone state, an explicit
unmanaged-domain rejection, a regression preserving a pre-existing
CNAME across the DNS-01 round-trip (the zone-wipe defence), and a
typed-error propagation test that surfaces `ErrInvalidToken` to
cert-manager so the controller will retry.
Helm template smoke render
==========================
`helm template` against the new chart with default values yields 12
resources / 424 lines (APIService, Certificate, ClusterRoleBinding,
Deployment, Issuer, Role, RoleBinding, Service, ServiceAccount). The
modified bp-cert-manager chart still renders both ClusterIssuers
(`letsencrypt-dns01-prod` + `letsencrypt-http01-prod`) with default
values; flipping `certManager.issuers.dns01.enabled=false` is the
clean rollback.
Smoke command (post-deploy)
===========================
kubectl get apiservices.apiregistration.k8s.io \
v1alpha1.acme.dynadot.openova.io
# Issue a *.<sovereign>.<pool> wildcard cert and watch the
# Order/Challenge progress through cert-manager.
CI
==
`.github/workflows/build-cert-manager-dynadot-webhook.yaml` mirrors the
pool-domain-manager-build pattern (cosign keyless signing, SBOM
attestation, GHCR push at `ghcr.io/openova-io/openova/cert-manager-
dynadot-webhook:<sha>`). Triggered by changes to either the binary or
the shared dynadot-client package.
Closes #159
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
20f5dca902 |
feat(wizard): #169 — StepDomain three-mode (pool / byo-manual / byo-api)
Closes openova#169.
Wizard UI:
- New StepDomain.tsx with three radio modes (pool / BYO manual NS / BYO registrar API). Pool flow unchanged from #163. BYO-manual surfaces the three OpenOva nameservers (ns1-3.openova.io) verbatim with copy buttons. BYO-api adds a registrar dropdown (Cloudflare, Namecheap, GoDaddy, OVH, Dynadot) + token field + Validate button; read-only validation hits /api/v1/registrar/{r}/validate before Next is enabled.
- StepOrg trimmed to org-only fields (domain capture moved to StepDomain).
- WizardPage + WizardLayout add the new "Domain" step (now 7 steps total).
Wizard store:
- DomainMode expanded to 'pool' | 'byo-manual' | 'byo-api' with legacy 'byo' coerced to 'byo-manual' on rehydrate.
- New fields: registrarType (RegistrarType | null), registrarToken, registrarTokenValidated.
- partialize() strips registrarToken + registrarTokenValidated from localStorage (credential hygiene per docs/INVIOLABLE-PRINCIPLES.md #10).
- setSovereignDomainMode cascades a clean reset of irrelevant fields.
PDM (core/pool-domain-manager):
- New endpoint POST /api/v1/registrar/{registrar}/validate: read-only twin of /set-ns. Calls adapter.ValidateToken; never flips NS records. Maps registrar errors to canonical HTTP statuses (401/403/429/502). Token never enters a logged struct.
catalyst-api (products/catalyst/bootstrap/api):
- New handler/registrar.go: thin proxy that forwards /api/v1/registrar/{r}/{validate|set-ns} to PDM's matching endpoint, reading the body once and streaming PDM's response status + body verbatim so the wizard's error-mapping vocabulary stays consistent.
Tests:
- StepDomain.test.tsx: 18 vitest cases covering all three modes, mode-switch field cleanup, validate fetch happy/error paths, token invalidation on edit.
- store.test.ts: wizard-store mutations + persist hygiene.
- StepSuccess.test.tsx: fixture updated 'byo' -> 'byo-manual'.
- registrar_test.go (PDM): 7 new test cases for /validate covering happy, invalid-token, domain-not-in-account, unsupported-registrar, missing-fields, bad-JSON, response-doesnt-leak-token.
103 vitest cases pass. Go tests pass for both PDM and catalyst-api.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
a6fb7410f4 |
feat(pdm): per-Sovereign PowerDNS zones for #168
Refactor pool-domain-manager to own per-Sovereign zones in PowerDNS,
replacing the previous Dynadot-set_dns2 record-write flow.
Phase 1 — internal/pdns: REST client for PowerDNS Authoritative API
- CreateZone / DeleteZone / EnsureZone / ZoneExists
- PatchRRSets (atomic batch RRset writes)
- AddARecord / AddNSDelegation / RemoveNSDelegation
- EnableDNSSEC: PUT dnssec flag, generate KSK+ZSK (algorithm 13
ECDSAP256SHA256 per docs/PLATFORM-POWERDNS.md), POST rectify
- retry-once-on-5xx with exponential backoff (250ms, 1s)
- X-API-Key header from K8s Secret, never logged
- 22 unit tests covering every method against httptest mock
Phase 2 — allocator: DNSWriter interface + per-Sovereign lifecycle
- /reserve: insert pdm-pg row + create child zone with apex NS
RRset + add NS delegation into parent + enable DNSSEC on child
- /commit: write the canonical 6-record set (apex, *, console,
api, gitea, harbor) into child zone, TTL 300, atomic PATCH
- /release: drop child zone (DNSSEC keys retire) + remove parent
NS delegation, idempotent on 404
- sweeper tears down DNS for expired reservations before deleting
pdm-pg rows
- rollback path on Reserve failure preserves operator UX
- allocator_test.go: fake DNSWriter for state-machine assertions
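The /commit step can be pictured with the sketch below, which builds the six-record set for a child zone. The `Record` type, the function names, and the choice to point every record at one IP are assumptions for illustration; only the six labels, the A type, TTL 300, and the `<subdomain>.<poolDomain>` zone name come from the commit.

```go
package main

import "fmt"

// Record is a hypothetical stand-in for one RRset in the PATCH payload.
type Record struct {
	Name  string
	Type  string
	TTL   int
	Value string
}

// childZone derives the zone name as <subdomain>.<poolDomain>, with a trailing dot.
func childZone(subdomain, poolDomain string) string {
	return subdomain + "." + poolDomain + "."
}

// canonicalRecords builds the six A records named in the commit
// (apex, *, console, api, gitea, harbor), all at TTL 300.
func canonicalRecords(subdomain, poolDomain, ip string) []Record {
	zone := childZone(subdomain, poolDomain)
	labels := []string{"", "*", "console", "api", "gitea", "harbor"}
	records := make([]Record, 0, len(labels))
	for _, label := range labels {
		name := zone // the empty label is the zone apex
		if label != "" {
			name = label + "." + zone
		}
		records = append(records, Record{Name: name, Type: "A", TTL: 300, Value: ip})
	}
	return records
}

func main() {
	// 203.0.113.10 is a documentation address, not a real Sovereign IP.
	for _, r := range canonicalRecords("acme", "omani.works", "203.0.113.10") {
		fmt.Printf("%-28s %s %d %s\n", r.Name, r.Type, r.TTL, r.Value)
	}
}
```

Building the whole set first and sending it in one PATCH is what makes the write atomic from the caller's point of view.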
Phase 3 — startup parent-zone bootstrap
- BootstrapParentZones runs at PDM startup before HTTP serves
- EnsureZone for every entry in DYNADOT_MANAGED_DOMAINS
- DNSSEC enabled on each parent zone (idempotent)
- PDM exits non-zero if bootstrap fails
Phase 4 — schema unchanged
- child zone name derived as <subdomain>.<poolDomain>, no new column
- existing pool_allocations table works as-is
Phase 5 — dynadot package trimmed
- removed AddSovereignRecords / DeleteSubdomainRecords / AddRecord /
getZone / writeZone (Dynadot DNS write code)
- kept IsManagedDomain / ManagedDomains / ResetManagedDomains /
ErrUnmanagedDomain (config-resolution helpers)
- registrar adapter at internal/registrar/dynadot/ untouched (handles
BYO Flow B NS-delegation via #170)
Phase 6 — env-var contract
PDM_PDNS_BASE_URL, PDM_PDNS_API_KEY, PDM_PDNS_SERVER_ID, PDM_NAMESERVERS
all runtime-configurable per docs/INVIOLABLE-PRINCIPLES.md #4.
Quality bar (all met):
- DNSSEC enabled on every child zone (mandatory per spec)
- parent NS delegation TTL 3600, child A-record TTL 300
- retry-once-on-5xx with exponential backoff in pdns client
- all credentials flow from env vars sourced from K8s Secrets
- no hardcoded URLs, regions, or NS endpoints
Closes openova#168 (DNS-side; private-repo manifest update lands separately).
|
||
|
|
567d7e1f60 |
feat(pdm): registrar adapters for Cloudflare, Namecheap, GoDaddy, OVH, Dynadot (#170)
Adds the BYO Flow B (#166) registrar-flip seam: PDM now exposes a provider-agnostic Registrar interface and 5 adapter implementations plus a new HTTP endpoint that dispatches to them.
Wire surface
- POST /api/v1/registrar/{registrar}/set-ns
Body: {"domain":"...","token":"...","nameservers":["..."]}
Reply: {"success":true,"registrar":"...","domain":"...","nameservers":["..."],"propagation":"..."}
- GET /healthz now lists the wired-in registrar names.
Interface (internal/registrar/registrar.go)
- Name(), ValidateToken, SetNameservers, GetNameservers
- Typed errors: ErrInvalidToken, ErrRateLimited, ErrDomainNotInAccount, ErrAPIUnavailable, ErrUnsupportedRegistrar
- Registry map[string]Registrar with Lookup + Names helpers
Adapters
- internal/registrar/cloudflare/: API v4 with Bearer token; verifies via /user/tokens/verify, looks up zone by name, PATCHes name_servers
- internal/registrar/namecheap/: XML API; ApiUser+ApiKey+UserName+ClientIp auth; getBalances probe + getList domain check; setCustom for write. IP-whitelisting requirement documented in source comments
- internal/registrar/godaddy/: v1 API with sso-key auth; GET /v1/domains list + PATCH /v1/domains/{d} with nameServers body
- internal/registrar/ovh/: request signing (HMAC-SHA1 over appSecret+consumerKey+method+url+body+timestamp); GET /domain probe; POST /domain/{d}/nameServers/update for write; GET .../nameServer[/{id}] for read
- internal/registrar/dynadot/: api3.json with key+secret as colon-separated token; uses set_ns + domain_info commands. Distinct from the existing internal/dynadot package which is the DNS-record writer for OpenOva-managed pool domains (different concern: pool DNS vs. customer-domain registrar NS-flip)
Token hygiene (per docs/INVIOLABLE-PRINCIPLES.md #10)
- Tokens never persisted: in-memory only for the duration of the call
- Never logged: handler uses classifyOutcome to render redacted outcome labels, never the raw error message or token
- Never echoed in responses
- TestSetNSResponseDoesNotEchoToken + TestSetNSHappy assert no token bytes appear in JSON body or zerolog/slog output
Tests
- 74 new unit tests (httptest server per adapter): cloudflare 11, dynadot 11, godaddy 11, namecheap 13, ovh 12, handler 14, registrar interface 2
- Each adapter covers: happy path, bad-token, rate-limited (429), bad-domain (404 / not-in-account), empty-NS guard, name+default
- OVH signature math verified deterministically via injected nowFn
Acceptance (issue #170)
- All 5 adapters pass their unit tests
- PDM /api/v1/registrar/{r}/set-ns endpoint live
- Wired into cmd/pdm/main.go: every adapter registered at startup
Per docs/INVIOLABLE-PRINCIPLES.md #4 (never hardcode), each adapter's BaseURL is constructor-default + struct-overridable, so tests inject httptest endpoints without environment shenanigans.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
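The interface and registry described above might be shaped roughly as follows. The method names, the error name, and the `Lookup` and `Names` helpers come from the commit; the parameter lists, the sorted output of `Names`, and the `fakeAdapter` type are assumptions made so the sketch compiles on its own.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
)

var ErrUnsupportedRegistrar = errors.New("unsupported registrar")

// Registrar lists the four methods named in the commit.
// The parameter lists are assumptions.
type Registrar interface {
	Name() string
	ValidateToken(ctx context.Context, domain, token string) error
	SetNameservers(ctx context.Context, domain, token string, nameservers []string) error
	GetNameservers(ctx context.Context, domain, token string) ([]string, error)
}

// Registry dispatches from the {registrar} path segment to an adapter.
type Registry map[string]Registrar

// Lookup returns the adapter registered under name.
func (r Registry) Lookup(name string) (Registrar, error) {
	adapter, ok := r[name]
	if !ok {
		return nil, ErrUnsupportedRegistrar
	}
	return adapter, nil
}

// Names returns the registered adapter names in sorted order.
func (r Registry) Names() []string {
	names := make([]string, 0, len(r))
	for name := range r {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// fakeAdapter is a hypothetical in-memory adapter used only for this demo.
type fakeAdapter struct{ name string }

func (f fakeAdapter) Name() string                                                   { return f.name }
func (f fakeAdapter) ValidateToken(context.Context, string, string) error            { return nil }
func (f fakeAdapter) SetNameservers(context.Context, string, string, []string) error { return nil }
func (f fakeAdapter) GetNameservers(context.Context, string, string) ([]string, error) {
	return []string{"ns1.openova.io", "ns2.openova.io", "ns3.openova.io"}, nil
}

func main() {
	reg := Registry{"cloudflare": fakeAdapter{"cloudflare"}, "ovh": fakeAdapter{"ovh"}}
	fmt.Println(reg.Names())
	_, err := reg.Lookup("example-registrar")
	fmt.Println(err)
}
```

Keeping the registry a plain map means the HTTP handler stays provider-agnostic: adding a sixth registrar is one more entry at startup.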
||
|
|
585b046f5d |
feat(pdm): pool-domain-manager service skeleton (Phase 1 of #163)
Build a new Go service core/pool-domain-manager that becomes the SOLE
authority for OpenOva-pool subdomain allocation across the fleet.
Why this exists: today products/catalyst/bootstrap/api/internal/handler/
subdomains.go does naive net.LookupHost() to decide whether a candidate
subdomain is taken. Dynadot's wildcard parking record at the apex of
omani.works (and any future pool domain) makes EVERY subdomain resolve
to 185.53.179.128, so the check rejects everything. DNS is the wrong
source of truth for an OpenOva-managed pool — the central control plane
must own the allocation table.
What this commit adds (no integration with catalyst-api yet — that lands
in a follow-up commit):
core/pool-domain-manager/
cmd/pdm/main.go chi router, healthz, sweeper boot
api/openapi.yaml wire contract for every endpoint
Containerfile alpine final stage, UID 65534
internal/store/ pgx + CNPG; pool_allocations table
migrations.sql idempotent CREATE TABLE schema
store.go Reserve/Get/Commit/Release/List
store_test.go integration tests (PDM_TEST_DSN)
internal/dynadot/ moved + extended; SOLE Dynadot caller
dynadot.go AddRecord, AddSovereignRecords,
DeleteSubdomainRecords (read-modify-
write to honour feedback_dynadot_dns)
dynadot_test.go managed-domain resolution tests
internal/reserved/ centralised reserved-name list
reserved.go IsReserved/All; pulled out of
catalyst-api's subdomains.go
internal/handler/ HTTP surface
handler.go /api/v1/pool/{domain}/{check,reserve,
commit,release,list}, /healthz,
/api/v1/reserved
internal/allocator/ state machine + sweeper goroutine
Architecture choices and how they map to docs/INVIOLABLE-PRINCIPLES.md:
- Principle #4 (never hardcode): every value (PORT, PDM_DATABASE_URL,
DYNADOT_MANAGED_DOMAINS, PDM_RESERVATION_TTL, PDM_SWEEPER_INTERVAL)
flows from env vars; the K8s ExternalSecret will populate them at
deploy time. The reserved-subdomain list lives in ONE place
(internal/reserved); catalyst-api will not duplicate it.
- Principle #2 (no quality compromise): the state machine commits the
DB row before the Dynadot side-effect, so a crash between the two
leaves the system in a recoverable state (operator runs Release).
The reservation_token in the row protects against stale-tab commit
races. UPSERT semantics + a CHECK constraint mean two operators
racing /reserve get a clean 23505 (unique_violation) → HTTP 409.
- Principle #3 (follow architecture): PDM is a ClusterIP service in
openova-system — it is not a Crossplane provider, not a Flux
HelmRelease, not bespoke OpenTofu state. catalyst-api speaks to it
via plain HTTP. The Crossplane Composition that wraps PDM as a
declarative MR (XDynadotPoolAllocation) lands in a follow-up phase.
The DNS-wildcard problem the issue describes is fixed STRUCTURALLY here:
PDM never calls net.LookupHost. The /check path is a single SELECT
against pool_allocations. omani.works's wildcard A record at the apex
becomes architecturally irrelevant.
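The reservation semantics described above (table lookup for /check, conflict on a live reservation, token-guarded commit, TTL expiry) can be sketched in memory as follows. This is an illustration only: the real service keeps the rows in Postgres, and `Allocator`, `defaultTTL`, and the method names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

var (
	ErrConflict      = errors.New("subdomain already reserved")
	ErrTokenMismatch = errors.New("reservation token mismatch")
)

// defaultTTL is a hypothetical stand-in for PDM_RESERVATION_TTL.
const defaultTTL = 10 * time.Minute

type row struct {
	token     string
	expiresAt time.Time
	committed bool
}

// Allocator keeps the allocation table in memory.
type Allocator struct {
	mu   sync.Mutex
	rows map[string]row
}

func NewAllocator() *Allocator { return &Allocator{rows: make(map[string]row)} }

// live returns the row if it is committed or its reservation has not expired.
// Callers must hold a.mu.
func (a *Allocator) live(name string) (row, bool) {
	r, ok := a.rows[name]
	if !ok || (!r.committed && !time.Now().Before(r.expiresAt)) {
		return row{}, false
	}
	return r, true
}

// Available answers /check from the table alone; no DNS lookup is involved.
func (a *Allocator) Available(name string) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	_, held := a.live(name)
	return !held
}

// Reserve claims name for ttl, or fails if someone else holds it.
func (a *Allocator) Reserve(name, token string, ttl time.Duration) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if _, held := a.live(name); held {
		return ErrConflict
	}
	a.rows[name] = row{token: token, expiresAt: time.Now().Add(ttl)}
	return nil
}

// Commit makes the reservation permanent if the token matches.
func (a *Allocator) Commit(name, token string) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	r, held := a.live(name)
	if !held || r.token != token {
		return ErrTokenMismatch
	}
	r.committed = true
	a.rows[name] = r
	return nil
}

func main() {
	a := NewAllocator()
	fmt.Println(a.Reserve("acme", "tok-1", defaultTTL)) // first tab wins
	fmt.Println(a.Reserve("acme", "tok-2", defaultTTL)) // second tab conflicts
	fmt.Println(a.Commit("acme", "tok-2"))              // stale token is rejected
	fmt.Println(a.Commit("acme", "tok-1"))
}
```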
Tests exercised in this commit:
- internal/reserved: full unit coverage (case-insensitive, sorted, set
membership)
- internal/dynadot: managed-domain runtime resolution (env-var,
legacy single-domain fallback, built-in defaults, list parsing)
- internal/store: integration suite gated on PDM_TEST_DSN env var,
covers reserve happy-path, reserve race (ErrConflict), TTL expiry
frees, commit happy-path, commit token mismatch, release removes
row, sweeper deletes expired rows
Closes phase 1 of #163. Phase 2 (catalyst-api wiring), Phase 3 (CI +
manifests), Phase 4 (Crossplane composition), Phase 6 (deploy +
verification curl) follow in separate commits.
Refs: #163
|
||
|
|
9519c1ef00 | merge: Group L testing (Playwright e2e smoke tests, Hetzner provisioning test scaffold gated on HETZNER_TEST_TOKEN secret, integration tests for bootstrap installer + Dynadot + voucher) | ||
|
|
7edf63ca7e |
docs(franchise),test(billing): voucher CRD propagation invariant
#118 verifies that the voucher shape on a franchised Sovereign is identical to Catalyst-Zero. Two artefacts:
1. New §"Voucher shape propagates automatically" in docs/FRANCHISE-MODEL.md explaining WHY there is no propagation problem to solve: vouchers are not a CRD. They are rows in the per-Sovereign billing service's Postgres database, and every Sovereign runs the same SHA-pinned core/services/billing image. Same image → same migration → same schema → same handlers → same shape. The doc lists which file owns each part of the shape and includes a 4-step curl smoke test to run on any Sovereign at first-provisioning to confirm the invariant holds.
2. New core/services/billing/handlers/vouchers_test.go covering the public POST /billing/vouchers/redeem-preview endpoint added in #117. Four cases:
- 404 on unknown / soft-deleted code (no tombstone leak)
- 200 on a valid live code, asserting the public shape excludes times_redeemed and max_redemptions (defence-in-depth against enumeration)
- 410 Gone on a code that exists but has hit its cap, with the credit/description still in the response so the landing page can show "campaign ended"
- 400 on whitespace-only input
The tests run on every CI build of the billing service, on every Sovereign that builds from this repo. If a future change drifts the preview endpoint's shape, the tests fail before the regression can ship.
Also tidies vouchers.go imports (removed two unused stdlib imports that were placeholder).
Closes #118. |
||
|
|
9404632830 |
feat(marketplace): public /redeem?code=... voucher landing flow
#116 adds the public landing page that the franchise model relies on to convert voucher distribution into Catalyst signups (per docs/FRANCHISE-MODEL.md §3, "redemption flow end-to-end").
New page core/marketplace/src/pages/redeem.astro:
- Reads ?code=... from the URL (or accepts manual entry if absent).
- POSTs to /api/billing/vouchers/redeem-preview (added in #117); does NOT consume the voucher, just validates it.
- Renders one of four states:
* Valid (200): "X OMR credit" + description + "Sign up to redeem" CTA. The CTA stashes the code in localStorage under `sme-pending-voucher` and routes to /plans (the start of the existing signup wizard).
* Campaign ended (410): inactive or capped; shows the credit that was offered + a path to sign up without a voucher.
* Not valid (404): never existed or soft-deleted (#91 tombstone-leak protection; the two are indistinguishable on the public surface).
* No code present: a manual input form so a redeemer who landed on /redeem without a query string can paste their code.
CheckoutStep wiring (core/marketplace/src/components/CheckoutStep.svelte):
- The `promoCode` $state now hydrates from `sme-pending-voucher` so a redeemer arriving via /redeem reaches /checkout with the field pre-filled. They can still edit or clear it.
- After submitting to /billing/checkout, we clear the localStorage stash. This prevents a second signup on the same browser from silently carrying over the previous voucher.
The actual redemption (insert into promo_redemptions, increment times_redeemed, credit_ledger entry) still happens transactionally inside POST /billing/checkout; splitting it out would risk a partially-redeemed code with no Order to show for it (the same class of bug #91 fixed).
Per docs/INVIOLABLE-PRINCIPLES.md §1: target-state shape, not MVP. The page handles all four observable backend states; manual-entry fallback is included; the "campaign ended" path keeps the user moving into signup rather than dead-ending.
Closes #116. |
||
|
|
12387a4a74 |
feat(billing): /billing/vouchers/{issue,list,revoke,redeem-preview} surface
#117 adds a franchise-aligned URL surface for the existing PromoCode voucher implementation, plus one new endpoint (redeem-preview) for the public landing flow described in docs/FRANCHISE-MODEL.md §3.
The orchestrator's hint was right: the issue/list/revoke handlers already exist (AdminUpsertPromo / AdminListPromos / AdminDeletePromo on the legacy /billing/admin/promos surface). This commit:
1. Adds new endpoint handlers in core/services/billing/handlers/vouchers.go:
- POST /billing/vouchers/issue (superadmin or sovereign-admin)
- GET /billing/vouchers/list (superadmin or sovereign-admin)
- DELETE /billing/vouchers/revoke/{code} (superadmin or sovereign-admin)
- POST /billing/vouchers/redeem-preview (unauthenticated; public)
The first three reuse the existing store-layer methods. The last is new: it validates a code without consuming it, returning a safe shape (no times_redeemed, no max_redemptions exposure) so an attacker scraping the public endpoint cannot enumerate cap status.
2. Distinguishes 404 (code never existed or soft-deleted; same tombstone-leak protection as #91) from 410 Gone (code exists but is inactive or capped). The 410 body still includes the credit and description so the landing page can show "this campaign has ended".
3. Keeps the legacy /billing/admin/promos endpoints in place: the existing admin UI continues to work without any breaking change. New code should target /billing/vouchers/...
4. Updates docs/FRANCHISE-MODEL.md to point to the new URL surface.
The actual REDEMPTION still happens transactionally inside POST /billing/checkout via the `promo_code` field; that path locks the promo row, inserts the promo_redemptions edge, increments times_redeemed, and adds the credit_ledger entry in one transaction. Splitting it into a separate /redeem endpoint would break that atomicity, so we deliberately do not add one. The public redeem flow is preview → signup → checkout-with-promo_code.
Closes #117. |
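The status decision for redeem-preview (400, 404, 410, 200) can be illustrated with the sketch below. The `Voucher` fields, the `Store` type, and the convention that a zero cap means unlimited are assumptions; the real handler reads from Postgres and its field names are not shown in the commit.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// Voucher is a hypothetical in-memory view of a promo code row.
type Voucher struct {
	Active         bool
	Deleted        bool // soft-deleted tombstone
	TimesRedeemed  int
	MaxRedemptions int // 0 is treated as "no cap" here (assumption)
}

// Store is a hypothetical lookup table keyed by voucher code.
type Store map[string]Voucher

// previewStatus decides the HTTP status for a preview request without
// consuming the voucher.
func previewStatus(s Store, code string) int {
	code = strings.TrimSpace(code)
	if code == "" {
		return http.StatusBadRequest // 400: whitespace-only input
	}
	v, ok := s[code]
	if !ok || v.Deleted {
		// Unknown and soft-deleted codes look identical to the public.
		return http.StatusNotFound // 404
	}
	capped := v.MaxRedemptions > 0 && v.TimesRedeemed >= v.MaxRedemptions
	if !v.Active || capped {
		return http.StatusGone // 410: campaign ended
	}
	return http.StatusOK
}

func main() {
	s := Store{
		"WELCOME": {Active: true, MaxRedemptions: 5, TimesRedeemed: 1},
		"SPRING":  {Active: true, MaxRedemptions: 5, TimesRedeemed: 5},
		"OLD":     {Active: true, Deleted: true},
	}
	for _, code := range []string{"WELCOME", "SPRING", "OLD", "NOPE", "   "} {
		fmt.Printf("%q -> %d\n", code, previewStatus(s, code))
	}
}
```

Returning 404 for both unknown and soft-deleted codes is what prevents the public endpoint from confirming that a revoked code ever existed.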
||
|
|
3e956b7d81 |
test: voucher issuance integration test — real Postgres (#147)
Closes the Group L "integration test — voucher issuance via API — issue → redeem → Org created path" ticket.
Per docs/INVIOLABLE-PRINCIPLES.md principle #2 (no mocks where the test would otherwise verify real behavior), this test runs against a real PostgreSQL, not sqlmock. The voucher mechanic lives in store.RedeemPromoCode which runs a transaction with SELECT FOR UPDATE on promo_codes, COUNT lookup on promo_redemptions, and inserts into credit_ledger. Mocking SQL strings doesn't verify whether the transactional invariants actually hold under concurrent contention; this codebase has been bitten by exactly that gap before (#93: counter incremented before order was committed).
The test is gated on BILLING_TEST_PG_URL; when unset, it skips (NOT mocks). CI populates it via the new postgres service container in .github/workflows/test-billing-integration.yaml.
Each test gets its own Postgres schema (via CREATE SCHEMA + libpq's options=-c search_path) so parallel runs don't cross-contaminate, and so goroutine concurrency tests reliably hit the same schema regardless of which pooled connection they pick up.
Coverage:
- Issue → Redeem → Credit applied (the canonical happy path)
- Per-customer double-redemption blocked
- Redemption cap enforced under concurrency (12 goroutines fighting for a 5-cap voucher → exactly 5 successful redemptions, no more)
- Soft-deleted codes rejected as "not found" (no tombstone leak per #91)
- Inactive codes rejected with distinct "not active" error
- Two different customers can each redeem the same voucher
- Org-creation prerequisites: customer.tenant_id non-empty, balance > 0 (these are the inputs the downstream tenant.created event consumer feeds into CreateTenant; covered by tenant-service consumer_test.go)
CI workflow added: .github/workflows/test-billing-integration.yaml runs the tests against a postgres:16-alpine service container with -race.
Refs #147
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
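The cap-under-concurrency invariant (12 goroutines, cap of 5, exactly 5 successes) can be demonstrated with an in-memory analogy. This is not the test from the commit: a mutex stands in for the row lock taken by SELECT FOR UPDATE, and every type and function name here is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

var (
	ErrCapReached      = errors.New("redemption cap reached")
	ErrAlreadyRedeemed = errors.New("customer already redeemed this voucher")
)

// Voucher guards its redemption set with a mutex, playing the role of the
// row lock in the real transaction.
type Voucher struct {
	mu         sync.Mutex
	max        int
	redeemedBy map[string]bool
}

func NewVoucher(max int) *Voucher {
	return &Voucher{max: max, redeemedBy: make(map[string]bool)}
}

// Redeem checks and updates the cap inside one critical section, so the
// check can never be separated from the write.
func (v *Voucher) Redeem(customer string) error {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.redeemedBy[customer] {
		return ErrAlreadyRedeemed
	}
	if len(v.redeemedBy) >= v.max {
		return ErrCapReached
	}
	v.redeemedBy[customer] = true
	return nil
}

// raceRedeem lets n distinct customers redeem at once and counts successes.
func raceRedeem(v *Voucher, n int) int {
	var wg sync.WaitGroup
	var ok int32
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if v.Redeem(fmt.Sprintf("customer-%d", id)) == nil {
				atomic.AddInt32(&ok, 1)
			}
		}(i)
	}
	wg.Wait()
	return int(ok)
}

func main() {
	fmt.Println("successful redemptions:", raceRedeem(NewVoucher(5), 12))
}
```

The analogy only shows the shape of the invariant; whether the SQL transaction actually holds it under contention is exactly what the real-Postgres test exists to verify.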
||
|
|
fabedd42c1 |
feat(admin,billing): per-Sovereign voucher issuance for sovereign-admin
#115 extends the existing PromoCode (voucher) admin surface so a sovereign-admin role can issue, list, and revoke vouchers on a franchised Sovereign. No new endpoints, no new schema, no new CRD: all the changes are role-gating widenings on the existing surface.
Backend (core/services/billing/handlers/handlers.go):
- New `requireVoucherIssuer` helper accepts both `superadmin` and `sovereign-admin`. Used by AdminListPromos, AdminUpsertPromo, and AdminDeletePromo only. All other admin endpoints (Stripe settings, revenue, orders) keep the existing `requireAdmin` (superadmin-only).
UI (core/admin/src/components/AdminShell.svelte + BillingPage.svelte):
- AdminShell now accepts both roles. Sidebar nav is filtered by role: superadmin sees Revenue / Catalog / Tenants / Orders / Billing; sovereign-admin sees only Billing. Filtering is via a `superadminOnly` flag on each nav item (defence-in-depth: even if a sovereign-admin guesses a URL, the backend's requireAdmin will return 403).
- BillingPage hides the Stripe Configuration section for sovereign-admin (it would 403 from GET /billing/admin/settings anyway). The Vouchers (Promo Codes) section is shown to both roles with a small label tweak ("Issued vouchers are scoped to this Sovereign" for sovereign-admin).
Per docs/INVIOLABLE-PRINCIPLES.md §1 (target-state shape, no MVP) and §3 (follow documented architecture exactly): this matches the FRANCHISE-MODEL.md design where "every franchised Sovereign runs the same admin app" with role-based gating.
Closes #115. |
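The role-gating widening could be sketched as a small middleware pair. The helper names `requireAdmin` and `requireVoucherIssuer` come from the commit, but their signatures here are guesses, and reading the role from an `X-User-Role` header is purely a stand-in for however the real service resolves the caller's role.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// roleHeader is hypothetical; the real service likely derives the role
// from an authenticated session or token.
const roleHeader = "X-User-Role"

// roleAllowed reports whether role is one of the allowed roles.
func roleAllowed(role string, allowed ...string) bool {
	for _, a := range allowed {
		if role == a {
			return true
		}
	}
	return false
}

// requireRoles wraps a handler and answers 403 unless the caller's role matches.
func requireRoles(allowed ...string) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			if !roleAllowed(r.Header.Get(roleHeader), allowed...) {
				http.Error(w, "forbidden", http.StatusForbidden)
				return
			}
			next.ServeHTTP(w, r)
		})
	}
}

var (
	requireAdmin         = requireRoles("superadmin")
	requireVoucherIssuer = requireRoles("superadmin", "sovereign-admin")
)

// okHandler stands in for any admin endpoint.
var okHandler = http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
})

// statusAs calls h as the given role and returns the response status.
func statusAs(h http.Handler, role string) int {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set(roleHeader, role)
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	promos := requireVoucherIssuer(okHandler) // voucher endpoints
	settings := requireAdmin(okHandler)       // Stripe settings and similar
	fmt.Println(statusAs(promos, "sovereign-admin"), statusAs(settings, "sovereign-admin"))
}
```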
||
|
|
7646840ffe |
feat(consolidation): move 8 SME backend services + shared module to public repo
Per docs/PROVISIONING-PLAN.md and tickets [B] sme-backend group. Migrates the 8 Go backend services from openova-private/services/ to openova/core/services/, plus the shared module they all depend on, plus the services-build CI workflow.
What moved:
- services/auth → core/services/auth (Go HTTP service for SME marketplace authentication)
- services/billing → core/services/billing (Go HTTP service for billing + voucher backend)
- services/catalog → core/services/catalog (Go HTTP service for App catalog)
- services/domain → core/services/domain (Go HTTP service for tenant domain mapping)
- services/gateway → core/services/gateway (Go HTTP gateway with rate limiting)
- services/notification → core/services/notification (Go HTTP service with email templates)
- services/provisioning → core/services/provisioning (Go HTTP service that commits tenant Application manifests via Gitea/GitHub API)
- services/tenant → core/services/tenant (Go HTTP service for tenant lifecycle)
- services/shared → core/services/shared (shared Go module: db, events, health, middleware, respond)
- 9 go.mod files updated: module github.com/openova-io/openova-private/services/<X> → github.com/openova-io/openova/core/services/<X>
- 9 go.sum and import paths similarly updated
- replace directives updated: openova-private/services/shared → openova/core/services/shared
- sme-services-build.yaml workflow → services-build.yaml in .github/workflows/, paths/context/image-base/deploy paths all repointed at core/services + ghcr.io/openova-io/openova/services-* + products/catalyst/chart/templates/sme-services
- All 8 manifests in products/catalyst/chart/templates/sme-services/ updated: image refs ghcr.io/openova-io/openova-private/sme-{X} → ghcr.io/openova-io/openova/services-{X}
- provisioning.yaml GITHUB_REPO env var: "openova-private" → "openova"
Closes [B] sme-backend (10 tickets).
After this commit, all 14 user-facing + backend Catalyst-Zero modules build from this public repo:
- 4 UIs: console, admin, marketplace, catalyst-ui
- 2 backends: marketplace-api, catalyst-api
- 8 SME services: auth, billing, catalog, domain, gateway, notification, provisioning, tenant
- 1 shared Go module
Note: 1 line in core/services/provisioning/main.go retains a literal default of "openova-private" for the GITHUB_REPO fallback when env var is unset; the K8s manifest sets GITHUB_REPO=openova explicitly so this path is never exercised in the deployed runtime, and the in-code default will be cleaned up in a follow-up.
|
||
|
|
3c2f7e4cda |
feat(consolidation): Phase 1 — move Catalyst-Zero apps + CI + manifests into public monorepo
Per docs/PROVISIONING-PLAN.md Phase 1. Catalyst-Zero (the running deployment on Contabo k3s, namespaces catalyst/sme/marketplace/website) source code now lives in this public repo. Cutover to public-repo CI builds happens in Phase 2.
What moved (from openova-private → openova):
- apps/console/ → core/console/ (Astro+Svelte UI)
- apps/admin/ → core/admin/ (Astro+Svelte UI, includes canonical voucher/billing/tenants admin surface)
- apps/marketplace/ → core/marketplace/ (Astro+Svelte UI, 5-step Plan→Apps→Addons→Checkout→Review flow)
- website/marketplace-api/ → core/marketplace-api/ (Go backend with handlers/, provisioner/, store/)
- clusters/contabo-mkt/apps/catalyst/ → products/catalyst/chart/templates/ (catalyst-{ui,api} K8s manifests)
- clusters/contabo-mkt/apps/sme/services/ → products/catalyst/chart/templates/sme-services/ (15 manifests)
- clusters/contabo-mkt/apps/marketplace-api/ → products/catalyst/chart/templates/marketplace-api/
- 5 CI workflows (catalyst-build, marketplace-api-build, sme-{admin,console,marketplace}-build) → .github/workflows/, renamed to drop "sme-" prefix
Image refs updated:
- ghcr.io/openova-io/openova-private/catalyst-{ui,api} → ghcr.io/openova-io/openova/catalyst-{ui,api}
- ghcr.io/openova-io/openova-private/sme-{admin,console,marketplace} → ghcr.io/openova-io/openova/{admin,console,marketplace}
- ghcr.io/openova-io/openova-private/marketplace-api → ghcr.io/openova-io/openova/marketplace-api
Workflow path updates:
- paths: 'apps/{X}/**' → 'core/{X}/**'
- context: apps/{X} → core/{X}
- deploy paths: clusters/contabo-mkt/apps/{X}/.../{X}.yaml → products/catalyst/chart/templates/.../{X}.yaml
- deploy commit: git add clusters/ → git add products/
Deferred to follow-up phase:
- 8 legacy SME backend services (auth, billing, catalog, domain, gateway, notification, provisioning, tenant) keep their ghcr.io/openova-io/openova-private/sme-* image refs because their source code in openova-private/services/ has not yet been migrated to public repo. Tracked via TODO in core/README.md migration history.
- sme-services-build.yaml NOT migrated (matches deferred services).
Documentation updates:
- core/README.md rewritten to describe what's actually in this directory now (4 deployed modules, not the old Go-monorepo placeholder design)
- products/catalyst/README.md created with migration status table
- products/catalyst/chart/Chart.yaml created (umbrella bp-catalyst-platform chart)
- docs/IMPLEMENTATION-STATUS.md §1 + §2.1 + §6 updated: console/admin/marketplace/marketplace-api/catalyst-{ui,api} all flipped from 📐 to 🚧 (deployed but not yet wired to unified Catalyst contract); openova Sovereign description rewritten to make Catalyst-Zero status explicit; omantel target updated to omantel.omani.works on Hetzner.
Verification:
- 99 source files copied (verified via git ls-files count)
- All image refs updated except the 8 deferred legacy SME backend services (verified via grep openova-private)
- Workflow naming reflects unified Catalyst (no more "sme-" prefix)
Phase 2 next: trigger public-repo CI builds, GHCR images published under openova/ namespace, Flux source on Catalyst-Zero repointed to this repo, rolling update of Contabo pods to new image SHAs. Catalyst-Zero becomes self-built from the public repo.
|
||
|
|
b00ec8f4df |
docs(pass-30): core/README catalyst-provisioner scope confusion + neo4j clean
core/README.md "User journeys" table had: "Sovereign bootstrap | Phase 0
done by catalyst-provisioner; this codebase contains the OpenTofu modules
under apps/provisioning/opentofu/..." — conflating two distinct services.
Per SOVEREIGN-PROVISIONING.md §2, catalyst-provisioner is a separate
Blueprint (bp-catalyst-provisioner) — explicitly "not part of any
Sovereign at runtime" — and lives outside core/. The core/apps/provisioning/
service is for runtime Application provisioning (validate configSchema,
compose manifests, commit to Environment's Gitea repo), an entirely
different concern from Phase 0 Sovereign bootstrap. Rewritten to call out
the separation.
platform/neo4j/README.md: clean.
Recurring shorthand note: ws.<env>.> JetStream subjects in core/README +
ARCHITECTURE (5 instances) treated as documented shorthand — precise form
per NAMING §11.2 is ws.{org}-{env_type}.>. Tightening deferred.
Validation log Pass 30 entry added.
|
||
|
|
27325edb32 |
docs(iter-2): glossary alignment — rename workspace-controller, fix definitions
GLOSSARY.md line-by-line audit. Nine corrections.
1. workspace-controller → environment-controller everywhere. The
controller reconciles the Environment CRD; "workspace" is banned as
a Catalyst scope, so it cannot be in a component name either. Fixed
in: GLOSSARY, ARCHITECTURE, PLATFORM-TECH-STACK, NAMING-CONVENTION,
SOVEREIGN-PROVISIONING, IMPLEMENTATION-STATUS, core/README,
BUSINESS-STRATEGY. Banned-term entry in GLOSSARY now explicitly
covers component names too.
2. "workspace repos" (per-Environment Gitea repos) → "Environment
Gitea repos" in GLOSSARY, PLATFORM-TECH-STACK.
3. JWT claim {workspace, org, role} → {environment, org, role} in
ARCHITECTURE projector diagram.
4. OpenOva definition refined: was "Never used to name a product",
which contradicted "OpenOva Catalyst", "OpenOva Cortex". Now: brand
prefix in product names; bare "OpenOva" = the company; bare
"Catalyst" = the platform.
5. Catalyst definition completed: was missing provisioning, billing,
gitea, observability — now lists all 14 control-plane components,
pointing at the table below.
6. Catalyst components table: added `provisioning` (validates
configSchema, commits to Environment Gitea); reordered to match
ARCHITECTURE §3 grouping; clarified each component's source-of-truth
(catalog-svc reads monorepo + Gitea, blueprint-controller watches
monorepo + Gitea, etc.).
7. Environment definition: refers to NAMING §2.4 for env_type values;
removed inline list that didn't match canonical ordering. Added
concrete examples (acme-prod, acme-dev, bankdhofar-uat).
8. Application example: dropped "RocketChat" which appeared nowhere
else; replaced with generic "running deployment" plus the
established WordPress / Postgres examples.
9. sovereign-admin description: was "runs Crossplane" — Crossplane is
platform plumbing not user-facing. Now: "manages the underlying
clusters via Crossplane (which is platform plumbing, not a
user-facing surface)".
Banned-term coverage:
- "Workspace" entry now covers BOTH the Catalyst scope AND component
naming (workspace-controller → environment-controller).
Refs #37
|
||
|
|
2c4902b409 |
docs(iter-1): add IMPLEMENTATION-STATUS, fix wrong-org refs, reconcile monorepo
First validation iteration. Three concrete corrections.
1. Add docs/IMPLEMENTATION-STATUS.md as the bridge between target architecture and current code state. Status legend (✅ / 🚧 / 📐 / ⏸) applied per-component. Catalyst control plane = mostly 📐. Component READMEs = 🚧 (README only, no Blueprint manifests yet). products/axon = ✅ (only product with real code). core/ = 📐 (just .gitkeep).
2. Status banner added to ARCHITECTURE, SECURITY, SOVEREIGN-PROVISIONING, BLUEPRINT-AUTHORING, PERSONAS-AND-JOURNEYS, PLATFORM-TECH-STACK, SRE pointing readers at IMPLEMENTATION-STATUS.md before they treat any described feature as built. GLOSSARY also references it.
3. Architectural decision (Option A: monorepo canonical):
- Each platform/<name>/ and products/<name>/ folder is the source of ONE Blueprint, published as ghcr.io/openova-io/<name>:<semver> by CI fan-out from the monorepo root.
- BLUEPRINT-AUTHORING.md §1, §2, §13 rewritten to match.
- README.md "what's in this repo" rewritten to clarify monorepo + OCI-fan-out shape; no longer claims every directory is a Blueprint in a way that contradicts BLUEPRINT-AUTHORING.
Wrong-org fixes (3 places):
- docs/PERSONAS-AND-JOURNEYS.md:13 github.com/openova → openova-io
- docs/BLUEPRINT-AUTHORING.md:13 github.com/openova → openova-io
- docs/BLUEPRINT-AUTHORING.md:404 github.com/openova → openova-io
- docs/BLUEPRINT-AUTHORING.md ghcr.io/openova/* (3 refs) → openova-io
API group consistency:
- All references unified to catalyst.openova.io/v1alpha1 (was mixed v1 / v1alpha1; v1alpha1 is correct since the CRDs are design-stage with no implementation).
core/README.md updated to honestly describe the directory tree as "target structure with .gitkeep placeholders" rather than implying the apps/console, apps/projector, etc. binaries already exist. The legacy apps/bootstrap and apps/manager directories are acknowledged as transitional placeholders that will be removed when the new apps/ layout is scaffolded.
CLAUDE.md and .claude/project-memory.md updated to put IMPLEMENTATION-STATUS.md second in the read-first ordering.
Refs #37 |
||
|
|
039a724f31 |
docs: rewrite repository foundation around Catalyst as the platform
Repositions the public repo's identity. OpenOva is the company; Catalyst
is the platform. Sovereign is a deployed Catalyst. The historical
positioning (OpenOva = platform, Catalyst = bootstrap+IDP+lifecycle
sub-product) is retired. Catalyst now subsumes bootstrap, lifecycle, and
IDP responsibilities into one control plane.
- README.md Catalyst-first front door. Sovereign concept,
repo structure, stack at a glance, cloud
provider matrix, getting-started paths
(managed via marketplace.openova.io vs
self-host via catalyst-provisioner).
- CLAUDE.md Codebase guide for Claude. Banned-term table,
commit conventions (hatiyildiz default for
public repo), the no-fourth-surface rule,
per-component README rule of thumb.
- .claude/project-memory.md Reduced to an index + decision log;
full architecture moved to docs/. Stack
decisions locked (NATS JetStream, OpenBao,
SPIFFE/SPIRE, per-Org Keycloak SME / per-
Sovereign corporate, Crossplane only IaC,
no Terraform/Pulumi user-facing surface).
- core/README.md Catalyst control-plane Go application. Drops
the bootstrap-vs-manager split (both fold under
"Catalyst control plane"). Lists each component
deployable from this codebase: console,
marketplace, admin, projector, catalog-svc,
provisioning, workspace-controller, blueprint-
controller, billing. CRD list updated:
Sovereign / Organization / Environment /
Application / Blueprint / EnvironmentPolicy /
SecretPolicy / Runbook.
Refs #37
|
||
|
|
54b1b4bd3d |
docs: add unified naming convention and align existing docs
- Add docs/NAMING-CONVENTION.md: canonical naming standard for all cloud resources, K8s objects, DNS, and tags across all providers. Covers dimension taxonomy (provider/region/building-block/environment), the Don't-Repeat-the-Parent principle, 4-char DNS location codes with full lookup table, multi-tenant scoping via namespace, and migration rules.
- Fix SRE.md: remove primary/DR region labels; clusters are named by building block (rtz/dmz/mgt), not failover role. Both regions run symmetric rtz clusters; k8gb owns traffic distribution.
- Fix PLATFORM-TECH-STACK.md: update both Mermaid diagrams and region table to use Region A / Region B (rtz cluster) language.
- Fix core/README.md: Platform CRD example now references cluster context names (hz-fsn-rtz-prod / hz-hel-rtz-prod) instead of primary/standby roles.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
435f49738d |
feat: restructure platform to 52 components and 9 products
Technology forecast and strategic review restructure:
- Remove 13 components (backstage, mongodb, activemq, vitess, airflow, camel, dapr, superset, searxng, langserve, trino, lago, rabbitmq)
- Add 10 components (sigstore, syft-grype, nemo-guardrails, langfuse, reloader, matrix, ferretdb, litmus, livekit, coraza)
- Rename product: Synapse → Axon (SaaS LLM Gateway)
- Merge products: Titan + Fuse → Fabric (Data & Integration)
- New product: Relay (Communication)
- Replace Backstage with Catalyst IDP
- Replace MongoDB with FerretDB (MongoDB wire protocol on CNPG)
- Add supply chain security (Sigstore/Cosign, Syft+Grype)
- Add AI safety and observability (NeMo Guardrails, LangFuse)
- Add technology forecast 2027-2030 document
- Full verification pass: zero stale references across all docs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
||
|
|
10245dff98 |
feat: ecosystem expansion to 55 components with license compliance
- Replace BSL-licensed components with open-source alternatives: Terraform → OpenTofu (MPL 2.0), Vault → OpenBao (MPL 2.0), Redpanda → Strimzi/Kafka (Apache 2.0), n8n → Airflow (Apache 2.0)
- Add 14 new platform components: activemq, camel, clickhouse, dapr, debezium, falco, flink, iceberg, opensearch, rabbitmq, superset, temporal, trino, vitess
- Rename meta-platforms/ to products/ with new product names: Cortex (AI Hub), Fingate (Open Banking), Titan (Data Lakehouse), Fuse (Microservices Integration)
- Update all documentation, READMEs, and cross-references
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
||
|
|
535710289c |
feat: create OpenOva monorepo structure
Consolidate all component repos into a single monorepo:
- core/: Bootstrap + Lifecycle Manager application
- platform/: Individual component blueprints organized by category
  - networking/ (cilium, k8gb, external-dns, stunner)
  - security/ (cert-manager, external-secrets, vault, kyverno, trivy)
  - observability/ (grafana stack)
  - storage/ (minio, harbor, velero)
  - scaling/ (keda, vpa)
  - failover/ (failover-controller)
  - gitops/ (flux, gitea)
  - idp/ (backstage)
  - data/ (cnpg, mongodb, valkey, redpanda)
  - communication/ (stalwart)
  - iac/ (terraform, crossplane)
  - identity/ (keycloak)
- meta-platforms/: Bundled vertical solutions
  - ai-hub/ (enterprise AI platform)
  - open-banking/ (PSD2/FAPI fintech sandbox)
- docs/: Platform documentation (PLATFORM-TECH-STACK.md, SRE.md)
All internal links updated to use relative paths within monorepo.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|