* feat(qa-loop): tier-scoped test-session endpoint + canonical PW runner (iter-11 Fix #46)

Two coupled changes for the 5-agent QA team Test Executor:

Cluster-A: POST /api/v1/auth/test-session?tier=<tier> in catalyst-api mints session cookies for synthetic qa-test-{tier}@openova.io users across all 5 tiers (viewer/developer/operator/admin/owner). PIN-via-IMAP always lands tier=owner (the inbox is the owner's), so the matrix's ~37 tier-boundary 403/200 rows mis-fired every iteration.

Endpoint is gated by env CATALYST_TEST_SESSION_ENABLED: default empty/false → 404 Not Found, indistinguishable from a missing route on production Sovereigns. The qaFixtures.testSessionEnabled chart value sets the env; bootstrap-kit defaults this to "true" on QA Sovereigns (QA_TEST_SESSION_ENABLED:-true).

Adds 5 UserAccess CRs (qa-test-{viewer,developer,operator,admin,owner}) via templates/qa-fixtures/useraccess-qa-test-tiers.yaml so the useraccess-controller binds each synthetic user to its canonical tier role. Gated on AND of qaFixtures.enabled + qaFixtures.testSessionEnabled.

Cluster-B: Canonical Playwright runner at tools/qa-loop/playwright-runner.js with nav-interrupted recovery: catches "page.goto: Navigation ... interrupted by another navigation" exceptions thrown when SPA route guards redirect mid-goto, settles on the final URL, and re-runs the matrix's must_contain assertions against the recovered body. Iter-10/11 lost ~32 rows to this exception. Rows that bounce to /login surface a diagnostic "auth-redirect: cookie missing or expired" reason instead of a thrown exception so the Coordinator re-mints + re-runs cleanly. Future qa-loop iterations dispatch this runner instead of inventing a new /tmp/iterN/playwright-runner.js each cycle.

Per feedback_no_mvp_no_workarounds.md both changes are target-state (real, gated, complete), NOT stubs:
- The endpoint mints a real JWT via the same handover signer the PIN flow uses; the JWT carries tier + realm_access.roles + qa_test_session audit-log discriminator.
- The runner handles every nav-error class observed on omantel-chroot with Playwright resolution searching well-known locations.

Bumps bp-catalyst-platform 1.4.116 → 1.4.117. Closes most of the 277 FAILs in iter-11 by unblocking the tier-boundary contract and the PW nav-interrupted class.

Tests:
- 14 new unit tests in auth_test_session_test.go (disabled→404, enabled+5 tiers happy path, missing/bad tier, signer absent, body overrides). All PASS.
- helm lint + helm template render verified for both qaFixtures.enabled=false (default) and =true paths.
- JS syntax + nav-interrupted pattern matching against actual iter-11 errors verified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chart): use single-token Helm directive for CATALYST_TEST_SESSION_ENABLED

The strategy-flip-regression test runs `kubectl apply --dry-run=server` on the raw api-deployment.yaml template (no Helm render), so any `value:` field MUST be a YAML scalar that Go YAML can parse. Helm directives that contain literal "double-quoted" strings inside the braces break the parse: kubectl errors with 'did not find expected key' on line 924.

Replace the if/else+literal-strings shape with the same single-token pattern the existing KEYCLOAK_BOOTSTRAP_TIER_ROLES line uses (line 526):

    value: {{ <expression> | quote }}

The expression `(and .Values.qaFixtures .Values.qaFixtures.testSessionEnabled | default false | toString)` evaluates to "true" or "false", then `| quote` wraps it in YAML-safe double-quotes. Renders to value: "true" when both qaFixtures.enabled AND qaFixtures.testSessionEnabled are true; "false" otherwise.

The Go handler in handler/auth_test_session.go treats anything other than "true"/"1"/"yes" as disabled, so the wire behavior is identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tools/qa-loop
Canonical helpers for the 5-agent QA team's Test Executor role. Each
file in this directory is the SINGLE-SOURCE-OF-TRUTH replacement for
ad-hoc /tmp/iterN/* scripts that previous qa-loop iterations
re-invented from scratch every cycle.
When a future iteration discovers a Test-Executor pattern that should
become canon, the rule is: commit it here under
tools/qa-loop/, not under /tmp/iterN/.
Files
playwright-runner.js
Drives a single headless Chromium against every row in a matrix JSON
file whose executor_method == "playwright". It is memory-conscious:
one browser, one context, and one page are reused across all rows,
which keeps the process under ~2 GiB RSS so the Coordinator can hold
parallel Fix Authors (per feedback_machine_saturation_3rd_violation.md).
The key feature is nav-interrupted recovery (qa-loop iter-11
Cluster-B fix). The SPA's React route guard often pushes the
operator to /login or /provision/<id>/<page> mid-page.goto,
which Playwright surfaces as:
    Error: page.goto: Navigation to "https://X" is interrupted by another navigation to "https://Y"
Iter-10/11 lost ~32 rows to this thrown exception. The new runner
catches the recoverable subclass of nav errors, settles on the
final URL, and re-runs the matrix's must_contain /
must_not_contain assertions against the recovered body. Rows
that bounced to /login get a diagnostic auth-redirect: reason
(cookie missing or expired) so the Coordinator can re-mint and
re-run instead of treating them as code bugs.
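The recovery flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in playwright-runner.js: the helper names (`isNavInterrupted`, `checkBody`, `landedOnLogin`, `runRow`) are hypothetical, the regex covers only the one error message quoted above, and must_contain / must_not_contain are assumed to be arrays of substrings. `page` is expected to be a Playwright `Page`.

```javascript
// Matches only the error message quoted above; any other recoverable
// nav-error classes the real runner handles are outside this sketch.
const NAV_INTERRUPTED = /page\.goto: Navigation .* interrupted by another navigation/;

function isNavInterrupted(err) {
  return NAV_INTERRUPTED.test(String(err && err.message ? err.message : err));
}

// Assumption: must_contain / must_not_contain are arrays of substrings.
function checkBody(row, body) {
  const missing = (row.must_contain || []).filter((s) => !body.includes(s));
  const forbidden = (row.must_not_contain || []).filter((s) => body.includes(s));
  if (missing.length) return { verdict: "FAIL", reason: `missing: ${missing.join(", ")}` };
  if (forbidden.length) return { verdict: "FAIL", reason: `forbidden: ${forbidden.join(", ")}` };
  return { verdict: "PASS", reason: "ok" };
}

function landedOnLogin(finalUrl) {
  return new URL(finalUrl).pathname.startsWith("/login");
}

// Hypothetical per-row driver: swallow the interrupted-navigation error,
// let the SPA settle, then assert against whatever page we ended up on.
async function runRow(page, row, opts = {}) {
  let recovered = false;
  try {
    await page.goto(row.url, { timeout: opts.timeoutMs ?? 25000 });
  } catch (err) {
    if (!isNavInterrupted(err)) throw err; // not recoverable in this sketch
    recovered = true;
    await page.waitForTimeout(opts.settleMs ?? 800);
  }
  const finalUrl = page.url();
  const base = {
    id: row.id,
    url: row.url,
    final_url: finalUrl,
    recovered_from_nav_interrupt: recovered,
  };
  if (recovered && landedOnLogin(finalUrl)) {
    return { ...base, verdict: "FAIL", reason: "auth-redirect: cookie missing or expired" };
  }
  const body = await page.innerText("body");
  return { ...base, ...checkBody(row, body) };
}
```

Keeping the error classification and the body assertions as pure functions means they can be checked against logged error strings without launching a browser.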
Usage
    node tools/qa-loop/playwright-runner.js \
      --matrix=/path/to/test-matrix.json \
      --cookies=/path/to/cookies.txt \
      --out=/tmp/iter-N-pw-results.jsonl \
      --progress=/tmp/iter-N-progress.log \
      [--filter-category=resources] \
      [--filter-tier=viewer] \
      [--deployment-id=sovereign-omantel.biz] \
      [--timeout-ms=25000] \
      [--networkidle-ms=4000] \
      [--settle-ms=800] \
      [--headed]
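As a rough illustration of how the `--key=value` flags above might narrow the matrix, the sketch below parses the flags and selects rows. It is not taken from the runner: `parseFlags` and `selectRows` are hypothetical names, and it assumes the matrix is a JSON array whose rows carry `executor_method`, `category`, and `tier` fields.

```javascript
// Parse "--name=value" and bare "--name" arguments into a plain object.
function parseFlags(argv) {
  const flags = {};
  for (const arg of argv) {
    const m = /^--([^=]+)(?:=(.*))?$/.exec(arg);
    if (m) flags[m[1]] = m[2] === undefined ? true : m[2];
  }
  return flags;
}

// Keep only Playwright rows, optionally narrowed by category and tier.
// Assumption: each row has executor_method, category and tier fields.
function selectRows(rows, flags = {}) {
  const category = flags["filter-category"];
  const tier = flags["filter-tier"];
  return rows.filter(
    (row) =>
      row.executor_method === "playwright" &&
      (!category || row.category === category) &&
      (!tier || row.tier === tier)
  );
}
```

In a real script the first function would typically receive `process.argv.slice(2)`.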
The runner emits one JSONL line per test row with these fields:
    {
      "id": "TC-226",
      "category": "resources",
      "method": "playwright",
      "url": "https://...",
      "verdict": "PASS",
      "reason": "ok",
      "http_code": 200,
      "body_preview": "...",
      "final_url": "https://...",
      "recovered_from_nav_interrupt": false
    }
final_url and recovered_from_nav_interrupt are new in iter-11 —
they let the Coordinator distinguish "the SPA bounced but landed on
the right page" (recovered=true, verdict=PASS) from "the SPA bounced
to /login because the cookie expired" (recovered=true, verdict=FAIL,
reason starts with auth-redirect:).
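A Coordinator-side triage over the results file could look something like the sketch below. The function name `triage` and the bucket names are hypothetical. Only the `verdict`, `reason`, `id`, and `recovered_from_nav_interrupt` fields shown above are relied on.

```javascript
// Split JSONL results into buckets the Coordinator can act on:
//   passed        - plain PASS
//   recoveredPass - the SPA bounced but landed on the right page
//   remint        - bounced to /login; re-mint the cookie and re-run
//   failed        - everything else, i.e. candidate code bugs
function triage(jsonlText) {
  const buckets = { passed: [], recoveredPass: [], remint: [], failed: [] };
  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue; // tolerate blank lines / trailing newline
    const r = JSON.parse(line);
    if (r.verdict === "PASS") {
      (r.recovered_from_nav_interrupt ? buckets.recoveredPass : buckets.passed).push(r.id);
    } else if (String(r.reason).startsWith("auth-redirect:")) {
      buckets.remint.push(r.id);
    } else {
      buckets.failed.push(r.id);
    }
  }
  return buckets;
}
```

Rows in the `remint` bucket are the ones worth re-running after a fresh session is minted. Only `failed` rows need a Fix Author's attention.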
Tier-scoped runs (qa-loop iter-11 Cluster-A)
Pair this runner with the new POST /api/v1/auth/test-session
endpoint (catalyst-api auth_test_session.go) to assert the
matrix's tier-boundary 403/200 contract:
    # 1. Mint a viewer-tier session into a fresh cookie jar.
    #    The URL is quoted so the shell does not treat "?" as a glob.
    curl -fsS -c /tmp/viewer-cookies.txt -X POST \
      'https://console.omantel.biz/api/v1/auth/test-session?tier=viewer'

    # 2. Run only the viewer-tier rows of the matrix
    node tools/qa-loop/playwright-runner.js \
      --matrix=/path/to/test-matrix.json \
      --cookies=/tmp/viewer-cookies.txt \
      --filter-tier=viewer \
      --out=/tmp/iter-12-viewer-results.jsonl
The endpoint is gated by CATALYST_TEST_SESSION_ENABLED=true and
returns 404 on production Sovereigns, so this flow only runs on
QA / chroot Sovereigns where qaFixtures.testSessionEnabled: true.
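To cover the whole tier-boundary contract, the two steps above can be repeated for each of the five tiers. The sketch below only builds the command lines and does not execute them. `planTierRuns` and the per-tier file naming are hypothetical, and the tier list is taken from the five synthetic qa-test-{tier} users (viewer/developer/operator/admin/owner).

```javascript
const TIERS = ["viewer", "developer", "operator", "admin", "owner"];

// Build, per tier, the curl command that mints the session and the runner
// invocation that consumes the resulting cookie jar. Nothing is executed.
function planTierRuns({ baseUrl, matrix, outDir }) {
  return TIERS.map((tier) => {
    const jar = `${outDir}/${tier}-cookies.txt`; // hypothetical naming
    const mintUrl = new URL("/api/v1/auth/test-session", baseUrl);
    mintUrl.searchParams.set("tier", tier);
    return {
      tier,
      mint: ["curl", "-fsS", "-c", jar, "-X", "POST", mintUrl.toString()],
      run: [
        "node",
        "tools/qa-loop/playwright-runner.js",
        `--matrix=${matrix}`,
        `--cookies=${jar}`,
        `--filter-tier=${tier}`,
        `--out=${outDir}/${tier}-results.jsonl`,
      ],
    };
  });
}
```

Each `mint` / `run` array is shaped as an argv list, so it could be handed to something like `child_process.spawnSync(cmd[0], cmd.slice(1))` without any shell quoting concerns.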