* feat(metering): NewAPI NATS publisher + sme-billing subscriber + POST /metering/record (#798)
Per #795 [Q-mine-3] (NATS not RedPanda) + [Q-mine-4] (one ledger), add
the SME-2 metering integration end-to-end. NewAPI is consumed as the
upstream image `ghcr.io/openova-io/openova/newapi-mirror` (a pinned
mirror, not a fork) — the metering envelope is produced by a Go sidecar
that observes the OpenAI-style `usage.total_tokens` field on every
2xx /v1/* response. This avoids forking the upstream binary while still
producing the canonical envelope shape on `catalyst.usage.recorded`.
A) NewAPI metering sidecar — core/services/metering-sidecar/
- Transparent reverse proxy in front of NewAPI on its own port; the
bp-newapi Service routes the cluster-fronting port to the sidecar,
which forwards to NewAPI on the pod's loopback.
- Observes successful /v1/* JSON responses, parses
`usage.{prompt_tokens,completion_tokens,total_tokens}`, computes
amount_micro_omr = -tokens * priceMicroOMRPerToken, and publishes
one envelope on `catalyst.usage.recorded` per completed request.
- Failed (non-2xx), non-JSON, and admin-path requests are NOT billed.
- Customer-facing responses are NEVER blocked on metering: the
response body is restored before the publish, and if NATS is
unreachable the envelope is persisted to disk and retried by a
background drain loop.
- 14 unit tests (proxy + publisher + safeFilename guards).
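
The billing rule in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the sidecar's actual code: the type and function names (`usage`, `chatResponse`, `amountMicroOMR`, `errNotBillable`) are hypothetical, and treating a missing or zero `usage.total_tokens` as non-billable is an assumption.

```go
package main

import (
	"encoding/json"
	"errors"
)

// usage mirrors the OpenAI-style usage object named in the commit message.
type usage struct {
	PromptTokens     int64 `json:"prompt_tokens"`
	CompletionTokens int64 `json:"completion_tokens"`
	TotalTokens      int64 `json:"total_tokens"`
}

// chatResponse is a hypothetical minimal view of a /v1/* JSON response body.
type chatResponse struct {
	Usage *usage `json:"usage"`
}

// errNotBillable marks responses that must not produce a ledger envelope.
var errNotBillable = errors.New("response is not billable")

// amountMicroOMR returns the signed ledger amount for one response:
// -total_tokens * priceMicroOMRPerToken. Non-2xx and non-JSON responses
// are rejected. Rejecting a missing or zero usage block is an assumption.
func amountMicroOMR(status int, body []byte, priceMicroOMRPerToken int64) (int64, error) {
	if status < 200 || status > 299 {
		return 0, errNotBillable
	}
	var resp chatResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, errNotBillable
	}
	if resp.Usage == nil || resp.Usage.TotalTokens <= 0 {
		return 0, errNotBillable
	}
	return -resp.Usage.TotalTokens * priceMicroOMRPerToken, nil
}
```

With the chart's default price of 156 micro-OMR per token, a response reporting 3 total tokens yields -468 micro-OMR.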
B) sme-billing NATS subscriber — core/services/billing/handlers/
metering_consumer.go
- JetStream durable consumer `sme-billing-metering` on stream
`CATALYST_USAGE` (provisioned by sme-billing on startup).
- Idempotent on metadata.request_id via a UNIQUE partial index on
credit_ledger.external_ref; redelivery from the broker collapses
to a single ledger row.
- Customer auto-create on cold start (the rbac sme.user.created
envelope may land AFTER the first metered request; we don't strand
usage waiting for it).
- 11 unit tests covering happy-path, idempotency, malformed-payload
poison-pill, missing-request-id, non-negative amount guard,
resolver error → Nak, derive-micro-OMR-from-OMR, DB-error → Nak.
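
The test cases listed above suggest a three-way outcome per message, sketched here without the NATS client so the logic stands alone. All names (`disposition`, `envelope`, `recordFunc`, `handle`) are hypothetical. The commit message only states that resolver and DB errors lead to a Nak; terminating poison pills instead of redelivering them, and treating a zero amount as invalid, are assumptions.

```go
package main

import (
	"encoding/json"
	"errors"
)

// disposition is what the consumer tells the broker about one message.
type disposition string

const (
	ack  disposition = "ack"  // processed (or deduplicated); do not redeliver
	nak  disposition = "nak"  // transient failure; ask for redelivery
	term disposition = "term" // poison pill; redelivery cannot help (assumed)
)

// errTransient stands in for a resolver or database failure.
var errTransient = errors.New("transient failure")

// envelope holds only the fields this sketch needs.
type envelope struct {
	AmountMicroOMR int64 `json:"amount_micro_omr"`
	Metadata       struct {
		RequestID string `json:"request_id"`
	} `json:"metadata"`
}

// recordFunc writes one ledger row keyed by requestID. It is expected to be
// idempotent, so a redelivered message collapses to the existing row.
type recordFunc func(requestID string, amountMicroOMR int64) error

// handle validates the payload and maps the result to a disposition.
func handle(payload []byte, record recordFunc) disposition {
	var env envelope
	if err := json.Unmarshal(payload, &env); err != nil {
		return term // malformed payload
	}
	if env.Metadata.RequestID == "" || env.AmountMicroOMR >= 0 {
		return term // missing request_id or non-negative amount
	}
	if err := record(env.Metadata.RequestID, env.AmountMicroOMR); err != nil {
		return nak // resolver or DB error
	}
	return ack
}
```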
C) HTTP handler POST /billing/metering/record — handlers/metering.go
- Synchronous validate → INSERT credit_ledger → return
{ledger_entry_id, balance_after_omr, balance_after_micro_omr,
duplicate}. Same payload + idempotency guard as the NATS path.
- Auth: superadmin OR sovereign-admin (operator-admin model;
end-user LLM traffic flows through the sidecar, never this URL).
- 8 unit tests covering happy-path, idempotency, role gating,
malformed-JSON, positive-amount rejection, customer-not-found.
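
A rough sketch of the two guards described above, role gating and positive-amount rejection. The function and error names are hypothetical, and the assumption that the JWT carries the role as exactly the strings `superadmin` or `sovereign-admin` is not confirmed by this commit message.

```go
package main

import "errors"

var (
	errForbidden      = errors.New("role may not record metering entries")
	errPositiveAmount = errors.New("usage amounts must not be positive")
)

// authorize admits only the two operator roles named in the commit message.
// The exact claim values are an assumption.
func authorize(role string) error {
	switch role {
	case "superadmin", "sovereign-admin":
		return nil
	}
	return errForbidden
}

// validateAmount rejects positive amounts, since usage is a debit.
func validateAmount(amountMicroOMR int64) error {
	if amountMicroOMR > 0 {
		return errPositiveAmount
	}
	return nil
}
```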
D) Schema — core/services/billing/store/store.go
- ALTER TABLE credit_ledger ADD COLUMN amount_micro_omr BIGINT
(1 OMR = 1,000,000 micro-OMR; -0.000234 OMR = -234 micro-OMR
exact integer — preserves precision at metering rates).
- ADD COLUMN external_ref TEXT + UNIQUE partial index for
idempotency dedup.
- ADD COLUMN metadata JSONB for the raw envelope.
- GetCreditBalance projects both amount_omr (legacy) and
amount_micro_omr (new) into the integer-OMR view.
- GetCreditBalanceMicroOMR returns canonical precision.
- RecordUsage method: ON CONFLICT DO UPDATE … RETURNING (xmax<>0)
distinguishes fresh insert from duplicate without a follow-up
SELECT.
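
To illustrate why the schema adds an integer micro-OMR column, here is a small sketch of the unit arithmetic. The helper names are hypothetical and are not the store's API. It shows that a per-request charge such as -234 micro-OMR vanishes when reduced to whole OMR, while the micro-OMR integer keeps it exact.

```go
package main

import "fmt"

// microPerOMR is the scale stated in the commit message: 1 OMR = 1,000,000 micro-OMR.
const microPerOMR = 1_000_000

// wholeOMR reduces a micro-OMR amount to whole OMR using Go integer division,
// which truncates toward zero. Small charges collapse to 0.
func wholeOMR(micro int64) int64 {
	return micro / microPerOMR
}

// formatMicroOMR renders a micro-OMR amount as a decimal OMR string using
// integer math only, so no floating-point rounding is involved.
// (math.MinInt64 is not handled in this sketch.)
func formatMicroOMR(micro int64) string {
	sign := ""
	if micro < 0 {
		sign = "-"
		micro = -micro
	}
	return fmt.Sprintf("%s%d.%06d", sign, micro/microPerOMR, micro%microPerOMR)
}
```

For example, `formatMicroOMR(-234)` gives `-0.000234`, whereas `wholeOMR(-234)` gives `0`.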
E) Wiring
- core/services/shared/events/nats.go — minimal NATS JetStream
publisher + subscriber surface; legacy RedPanda producer/consumer
in events.go untouched per [Q-mine-3].
- core/services/billing/main.go — NATS_URL env; subscriber wired
in parallel with the existing RedPanda tenant-events consumer.
- middleware/jwt.go — exported test helper WithClaims so handler
tests can construct an authenticated context without minting a
real signed token.
- .github/workflows/services-build.yaml — metering-sidecar added
to the build matrix; deploy job skips it (image consumed by the
bp-newapi chart, not products/catalyst sme-services).
F) bp-newapi chart (1.0.0 → 1.1.0)
- meteringSidecar block in values.yaml: image, port, NATS URL,
priceMicroOMRPerToken (default 156 = 0.000156 OMR/token), spool
dir, header names, resources, securityContext (read-only-rootfs).
- deployment.yaml renders the sidecar container + emptyDir spool
volume when meteringSidecar.enabled (default true).
- service.yaml routes the cluster-fronting :3000 to the sidecar
when enabled, exposes a separate :3001 → NewAPI direct port for
bp-catalyst-platform admin-API traffic (ADR-0003 §3.2).
- networkpolicy.yaml allows the sidecar's port + nats-system
egress for JetStream publish.
Tests: 33 new (14 sidecar + 11 subscriber + 8 HTTP handler), all green.
Helm template renders cleanly with sidecar enabled and disabled.
Closes #798
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(billing/store): cast SUM to BIGINT so lib/pq scans into int64 (#798)
Postgres returns `SUM(int) + SUM(bigint)/integer` as `numeric`, which
lib/pq presents as a `[]uint8` decimal string ("50.000000000000000000000000")
that does NOT scan directly into Go int64 — the integration test
TestVoucherLifecycle_IssueRedeemAndCreditApplied caught this in CI on
the post-redeem balance read.
Wrap the SUM expressions in CAST(... AS BIGINT) so the column type is
unambiguously bigint and the Scan target stays uniform across pre-#798
rows (amount_omr only) and post-#798 rows (amount_micro_omr present).
Affects:
- GetCreditBalance
- GetCreditBalanceMicroOMR
- RecordUsage's running-balance read
Test mocks updated to match the new SQL prefix.
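
The failure mode can be reproduced in isolation. This is a simplified stand-in, not the store code: `scanInt64` is a hypothetical helper that applies the same base-10 integer parse that Go's database/sql uses when converting a textual driver value into an integer Scan target.

```go
package main

import "strconv"

// scanInt64 parses a textual column value as a base-10 int64. A numeric
// value rendered with a fractional part fails here, while a bigint value
// rendered as plain digits succeeds.
func scanInt64(raw []byte) (int64, error) {
	return strconv.ParseInt(string(raw), 10, 64)
}
```

Passing the string from the CI failure, `50.000000000000000000000000`, returns a syntax error, while `50` (what a bigint column produces) parses to 50.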
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>