openova/tools/qa-loop
e3mrah 4dd4150d16
feat(qa-loop): tier-scoped test-session endpoint + canonical PW runner (iter-11 Fix #46) (#1266)
* feat(qa-loop): tier-scoped test-session endpoint + canonical PW runner (iter-11 Fix #46)

Two coupled changes for the 5-agent QA team Test Executor:

Cluster-A — POST /api/v1/auth/test-session?tier=<tier> in catalyst-api
mints session cookies for synthetic qa-test-{tier}@openova.io users
across all 5 tiers (viewer/developer/operator/admin/owner). PIN-via-IMAP
always lands tier=owner (the inbox is the owner's), so the matrix's ~37
tier-boundary 403/200 rows mis-fired every iteration. Endpoint is gated
by env CATALYST_TEST_SESSION_ENABLED — default empty/false → 404 Not
Found, indistinguishable from a missing route on production Sovereigns.
qaFixtures.testSessionEnabled chart value sets the env; bootstrap-kit
defaults this to "true" on QA Sovereigns (QA_TEST_SESSION_ENABLED:-true).

Adds 5 UserAccess CRs (qa-test-{viewer,developer,operator,admin,owner})
via templates/qa-fixtures/useraccess-qa-test-tiers.yaml so the
useraccess-controller binds each synthetic user to its canonical tier
role. Gated on AND of qaFixtures.enabled + qaFixtures.testSessionEnabled.

Cluster-B — Canonical Playwright runner at tools/qa-loop/playwright-runner.js
with nav-interrupted recovery: catches "page.goto: Navigation ...
interrupted by another navigation" exceptions thrown when SPA route guards
redirect mid-goto, settles on the final URL, and re-runs the matrix's
must_contain assertions against the recovered body. Iter-10/11 lost ~32
rows to this exception. Rows that bounce to /login surface a diagnostic
"auth-redirect: cookie missing or expired" reason instead of a thrown
exception so the Coordinator re-mints + re-runs cleanly. Future qa-loop
iterations dispatch this runner instead of inventing a new
/tmp/iterN/playwright-runner.js each cycle.

Per feedback_no_mvp_no_workarounds.md both changes are target-state
(real, gated, complete), NOT stubs:
  - The endpoint mints a real JWT via the same handover signer the PIN
    flow uses; the JWT carries tier + realm_access.roles + qa_test_session
    audit-log discriminator.
  - The runner handles every nav-error class observed on omantel-chroot
    with Playwright resolution searching well-known locations.

Bumps bp-catalyst-platform 1.4.116 → 1.4.117.

Closes most of the 277 FAILs in iter-11 by unblocking the tier-boundary
contract and the PW nav-interrupted class.

Tests:
  - 14 new unit tests in auth_test_session_test.go (disabled→404,
    enabled+5 tiers happy path, missing/bad tier, signer absent,
    body overrides). All PASS.
  - helm lint + helm template render verified for both
    qaFixtures.enabled=false (default) and =true paths.
  - JS syntax + nav-interrupted pattern matching against actual
    iter-11 errors verified.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(chart): use single-token Helm directive for CATALYST_TEST_SESSION_ENABLED

The strategy-flip-regression test runs `kubectl apply --dry-run=server`
on the raw api-deployment.yaml template (no Helm render), so any
`value:` field MUST be a YAML scalar that Go YAML can parse. Helm
directives that contain literal "double-quoted" strings inside the
braces break the parse — kubectl errors with 'did not find expected
key' on line 924.

Replace the if/else+literal-strings shape with the same single-token
pattern the existing KEYCLOAK_BOOTSTRAP_TIER_ROLES line uses (line 526):

  value: {{ <expression> | quote }}

The expression `(and .Values.qaFixtures .Values.qaFixtures.testSessionEnabled
| default false | toString)` evaluates to "true" or "false" then `| quote`
wraps in YAML-safe double-quotes. Renders to value: "true" when both
qaFixtures.enabled AND qaFixtures.testSessionEnabled are true; "false"
otherwise. The Go handler in handler/auth_test_session.go treats
anything other than "true"/"1"/"yes" as disabled, so the wire behavior
is identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 07:40:44 +04:00

tools/qa-loop

Canonical helpers for the 5-agent QA team's Test Executor role. Each file in this directory is the SINGLE-SOURCE-OF-TRUTH replacement for ad-hoc /tmp/iterN/* scripts that previous qa-loop iterations re-invented from scratch every cycle.

When a future iteration discovers a Test-Executor pattern that should become canon, the rule is: commit it here under tools/qa-loop/, not under /tmp/iterN/.

Files

playwright-runner.js

Drives a single headless Chromium instance against every row in a matrix JSON file whose executor_method is "playwright". The runner is memory-conscious: one browser, one context, and one page are reused across all rows, which keeps it under ~2 GiB RSS so the Coordinator can hold parallel Fix Authors (per feedback_machine_saturation_3rd_violation.md).

The key feature is nav-interrupted recovery (qa-loop iter-11 Cluster-B fix). The SPA's React route guard often pushes the operator to /login or /provision/<id>/<page> mid-page.goto, which Playwright surfaces as:

Error: page.goto: Navigation to "https://X" is interrupted by
       another navigation to "https://Y"

Iter-10/11 lost ~32 rows to this thrown exception. The new runner catches the recoverable subclass of nav errors, settles on the final URL, and re-runs the matrix's must_contain / must_not_contain assertions against the recovered body. Rows that bounced to /login get a diagnostic auth-redirect: reason (cookie missing or expired) so the Coordinator can re-mint and re-run instead of treating them as code bugs.
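The recovery decision described above can be sketched as three small pure functions. This is an illustration, not the code in playwright-runner.js: the function names are hypothetical, and it assumes must_contain / must_not_contain are arrays of plain substrings and that the login page lives at the /login path.

```javascript
// Sketch only: these names are hypothetical, not taken from playwright-runner.js.

// Matches the message shape quoted above; tolerant of a line wrap.
const NAV_INTERRUPTED =
  /page\.goto: Navigation [\s\S]*interrupted by\s+another navigation/;

// True when the thrown error belongs to the recoverable nav-interrupted class.
function isRecoverableNavError(err) {
  const message = err && err.message ? err.message : String(err);
  return NAV_INTERRUPTED.test(message);
}

// Evaluate the row's assertions against a page body.
// Assumption: both assertion fields are arrays of substrings.
function checkBody(row, body) {
  for (const needle of row.must_contain || []) {
    if (!body.includes(needle)) {
      return { verdict: 'FAIL', reason: `missing: ${needle}` };
    }
  }
  for (const needle of row.must_not_contain || []) {
    if (body.includes(needle)) {
      return { verdict: 'FAIL', reason: `forbidden: ${needle}` };
    }
  }
  return { verdict: 'PASS', reason: 'ok' };
}

// Decide the verdict once the page has settled after an interrupted goto.
function verdictAfterRecovery(row, finalUrl, body) {
  const path = new URL(finalUrl).pathname;
  if (path === '/login' || path.startsWith('/login/')) {
    // Diagnostic reason instead of a thrown exception.
    return { verdict: 'FAIL', reason: 'auth-redirect: cookie missing or expired' };
  }
  return checkBody(row, body);
}
```

In a runner, isRecoverableNavError would sit in the catch block around page.goto, and verdictAfterRecovery would be called with page.url() and the page body after a short settle delay.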

Usage

node tools/qa-loop/playwright-runner.js \
  --matrix=/path/to/test-matrix.json \
  --cookies=/path/to/cookies.txt \
  --out=/tmp/iter-N-pw-results.jsonl \
  --progress=/tmp/iter-N-progress.log \
  [--filter-category=resources] \
  [--filter-tier=viewer] \
  [--deployment-id=sovereign-omantel.biz] \
  [--timeout-ms=25000] \
  [--networkidle-ms=4000] \
  [--settle-ms=800] \
  [--headed]
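As a rough illustration of how flags of this shape and the two filters might combine, the sketch below parses --key=value arguments and selects matrix rows. The helper names are hypothetical, and the category and tier row fields are assumed names; only executor_method is taken from this README.

```javascript
// Sketch only: parseFlags and selectRows are hypothetical helpers.

// Parse ["--filter-tier=viewer", "--headed"] into an object.
// A flag without "=value" (such as --headed) becomes true.
function parseFlags(argv) {
  const flags = {};
  for (const arg of argv) {
    const m = /^--([^=]+)(?:=(.*))?$/.exec(arg);
    if (m) flags[m[1]] = m[2] === undefined ? true : m[2];
  }
  return flags;
}

// Keep only playwright rows, optionally narrowed by category and tier.
// Assumption: rows carry `category` and `tier` fields.
function selectRows(rows, { filterCategory, filterTier } = {}) {
  return rows.filter((row) =>
    row.executor_method === 'playwright' &&
    (!filterCategory || row.category === filterCategory) &&
    (!filterTier || row.tier === filterTier));
}
```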

The runner emits one JSONL line per test row with these fields:

{
  "id": "TC-226",
  "category": "resources",
  "method": "playwright",
  "url": "https://...",
  "verdict": "PASS",
  "reason": "ok",
  "http_code": 200,
  "body_preview": "...",
  "final_url": "https://...",
  "recovered_from_nav_interrupt": false
}

final_url and recovered_from_nav_interrupt are new in iter-11. They let the Coordinator distinguish "the SPA bounced but landed on the right page" (recovered_from_nav_interrupt=true, verdict=PASS) from "the SPA bounced to /login because the cookie expired" (recovered_from_nav_interrupt=true, verdict=FAIL, reason starts with auth-redirect:).
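A consumer of the results file could use those two fields to triage rows. The sketch below is one possible shape, with a hypothetical function name, and relies only on the verdict, reason, id, and recovered_from_nav_interrupt fields shown above.

```javascript
// Sketch only: bucketResults is a hypothetical name, not part of the runner.
// Takes the JSONL text and groups row ids by what the Coordinator should do next.
function bucketResults(jsonl) {
  const buckets = { pass: [], recoveredPass: [], authRedirect: [], fail: [] };
  for (const line of jsonl.split('\n')) {
    if (!line.trim()) continue; // skip blank lines
    const r = JSON.parse(line);
    if (r.verdict === 'PASS') {
      // A bounce that still landed on the right page counts as a pass.
      (r.recovered_from_nav_interrupt ? buckets.recoveredPass : buckets.pass).push(r.id);
    } else if (String(r.reason).startsWith('auth-redirect:')) {
      // Re-mint the cookie and re-run; not a code bug.
      buckets.authRedirect.push(r.id);
    } else {
      buckets.fail.push(r.id);
    }
  }
  return buckets;
}
```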

Tier-scoped runs (qa-loop iter-11 Cluster-A)

Pair this runner with the new POST /api/v1/auth/test-session endpoint (catalyst-api auth_test_session.go) to assert the matrix's tier-boundary 403/200 contract:

# 1. Mint a viewer-tier session into a fresh cookie jar
curl -fsS -c /tmp/viewer-cookies.txt -X POST \
  "https://console.omantel.biz/api/v1/auth/test-session?tier=viewer"

# 2. Run only the viewer-tier rows of the matrix
node tools/qa-loop/playwright-runner.js \
  --matrix=/path/to/test-matrix.json \
  --cookies=/tmp/viewer-cookies.txt \
  --filter-tier=viewer \
  --out=/tmp/iter-12-viewer-results.jsonl

The endpoint is gated by CATALYST_TEST_SESSION_ENABLED=true and returns 404 on production Sovereigns, so this flow only works on QA / chroot Sovereigns whose chart values set qaFixtures.testSessionEnabled: true.
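To cover all five tiers, the same mint-then-run pair can be generated per tier. The sketch below only builds the argument lists. The function name, the outDir option, and the per-tier file naming are assumptions made for illustration; the endpoint path, flags, and tier names come from this page.

```javascript
// Sketch only: buildTierPlan and its options are hypothetical.
const TIERS = ['viewer', 'developer', 'operator', 'admin', 'owner'];

// Returns, for each tier, the curl argv that mints a session cookie and
// the node argv that runs the matching rows of the matrix.
function buildTierPlan({ baseUrl, matrix, outDir }) {
  return TIERS.map((tier) => {
    const cookies = `${outDir}/${tier}-cookies.txt`; // assumed naming
    const mintUrl = new URL('/api/v1/auth/test-session', baseUrl);
    mintUrl.searchParams.set('tier', tier);
    return {
      tier,
      mint: ['curl', '-fsS', '-c', cookies, '-X', 'POST', mintUrl.toString()],
      run: [
        'node', 'tools/qa-loop/playwright-runner.js',
        `--matrix=${matrix}`,
        `--cookies=${cookies}`,
        `--filter-tier=${tier}`,
        `--out=${outDir}/${tier}-results.jsonl`,
      ],
    };
  });
}
```

Passing each array to child_process.execFileSync (first element as the command, the rest as arguments) avoids shell quoting issues with the ? in the mint URL.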