Test Report — MangaAssist (Streamlit + 3D + ACP)

Date: 2026-04-30
Coverage: 110 tests across 4 test layers, 100% pass after two bug fixes.

Summary

Layer	Tool	Tests	Pass	Notes
Frontend (3D, browser)	Playwright (Chromium)	56	56	Headless, real WebGL via SwiftShader
Backend (Python smoke)	smoke_test.py	14	14	Catalog, MCP, guardrails, eval, atlas, traces
Backend (game prototypes)	smoke_game_prototypes.py	19	19	Atlas + Pokédex flows, edge cases
API (FastAPI ACP)	api_tests_acp.py	21	21	Discovery, products, cart, checkout, validation
Total		110	110

Bugs found and fixed during testing

File	Bug	Fix
mangaassist_3d/src/scenes/Lobby.tsx	"First Step" trophy auto-unlocks on player spawn (player spawns at `(0,0,4)` so `Math.hypot(0,4) = 4 > 1.5`)	Track initial position via `useRef`; only fire after `distanceTo(spawn) > 1.5`
mangaassist_3d/src/components/Player.tsx + App.tsx	V-key view toggle had stale closure inside R3F's async render loop — second press was no-op	Move global key handlers (V, B) to `App.tsx` outside the Canvas; use functional `setView(prev => …)`
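
The First Step fix comes down to which reference point the distance is measured from. The following is a minimal sketch of that arithmetic, written in Python rather than the app's TypeScript. The function names are hypothetical, and the assumption that the original check measured horizontal distance from the world origin is inferred from the `Math.hypot(0,4)` note in the table above.

```python
import math

THRESHOLD = 1.5          # unlock distance quoted in the report
SPAWN = (0.0, 0.0, 4.0)  # player spawn point (x, y, z) quoted in the report


def moved_from_origin(pos):
    """Buggy check (assumed): horizontal distance from the world origin."""
    x, _, z = pos
    return math.hypot(x, z) > THRESHOLD


def moved_from_spawn(pos, spawn=SPAWN):
    """Fixed check: straight-line distance from the recorded spawn point."""
    return math.dist(pos, spawn) > THRESHOLD


# At spawn the buggy check already fires; the fixed one does not.
print(moved_from_origin(SPAWN))           # True  (hypot(0, 4) = 4 > 1.5)
print(moved_from_spawn(SPAWN))            # False (distance 0)
print(moved_from_spawn((0.0, 0.0, 6.0)))  # True  (2 units from spawn)
```

In the real component the spawn point would be captured once (the report mentions `useRef`) so that later position updates are always compared against the same reference.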

Layer 1: Playwright (frontend, 3D)

Config: playwright.config.ts — Chromium with --use-gl=angle --use-angle=swiftshader for real WebGL in headless mode. webServer block auto-starts npm run preview on port 4173 if not already running.

Spec	Tests	What it covers
01-smoke.spec.ts	6	Page title, start overlay visible/dismisses, canvas dimensions, WebGL context exists, no console errors
02-hud.spec.ts	7	Scene title, control hints (W/A/S/D/Shift/Space/V/B), top-right counters, trophy counter, nav buttons, crosshair only in FP
03-routing.spec.ts	10	Atlas/Pokédex button nav, Lobby return, B key, V key toggle, "current scene" button hiding
04-persistence.spec.ts	6	Clean state shows zeros, seeded loved/deck/trophies reflected in HUD, persists across reload, malformed JSON ignored
05-detail-card.spec.ts	4	Chat panel scoped to MangaBot, detail modal not present by default, Pokédex deck pedestal renders, Atlas canvas interactive
06-visual.spec.ts	3	Visual-regression screenshot baselines for Lobby, Atlas, Pokédex (8% pixel-ratio tolerance)
07-responsive.spec.ts	8	1920×1080, 1366×768, 1024×768, 768×600 — HUD elements visible + within viewport bounds
08-keyboard.spec.ts	6	V toggle (TP↔FP), B key behavior, WASD no-throw, Shift+W (run) no-throw, Space (jump) no-throw, unrelated keys ignored
09-accessibility.spec.ts	6	`<html lang>`, favicon link, viewport meta, button text content present, HUD pointer-events policy, start overlay doesn't trap input
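
To give a feel for what an 8% pixel-ratio tolerance permits, here is a small sketch of the underlying arithmetic. It is not Playwright's comparison code: real comparisons also apply a per-pixel color threshold, and the helper names and the 1366×768 example size are illustrative assumptions.

```python
def diff_pixel_ratio(baseline, actual):
    """Fraction of pixels that differ between two equally sized images."""
    if len(baseline) != len(actual):
        raise ValueError("images must have the same number of pixels")
    if not baseline:
        return 0.0
    differing = sum(1 for a, b in zip(baseline, actual) if a != b)
    return differing / len(baseline)


def within_tolerance(baseline, actual, max_ratio=0.08):
    """True if the share of differing pixels does not exceed max_ratio."""
    return diff_pixel_ratio(baseline, actual) <= max_ratio


# Toy 100-pixel image where 8 pixels changed: exactly at the limit.
baseline = [(0, 0, 0)] * 100
actual = [(255, 255, 255)] * 8 + [(0, 0, 0)] * 92
print(within_tolerance(baseline, actual))  # True

# Pixel budget for a hypothetical 1366x768 screenshot.
print(int(1366 * 768 * 0.08))  # 83927
```

A budget of roughly 84,000 pixels at that size is large enough to hide a small missing HUD element, which is why the tolerance is best read as a guard against gross rendering failures.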

Run command:

cd mangaassist_3d
npx playwright test

Run time: ~2 min 22s for the full suite. HTML report is generated under mangaassist_3d/playwright-report/.

Layer 2: Streamlit smoke (backend)

streamlit_app/scripts/smoke_test.py — 14 steps:

[PASS] seed catalog (force=True)
[PASS] catalog search filters
[PASS] mcp tools direct call
[PASS] guardrails red-team suite
[PASS] visual index (hash backend) + search
[PASS] embedding atlas build
[PASS] trace_store roundtrip
[PASS] eval golden-set load
[PASS] compile-check every page + surface
[PASS] orchestrator action-intent regex coverage
[PASS] anime->manga reading-order lookup (curated graph)
[PASS] affiliate URL builder (no creds = search URL)
[PASS] acceptance-rate funnel (shown + clicked)
[PASS] orchestrator spoiler-policy switch is wired

Coverage is the underlying business logic: catalog, MCP tools (anime↔manga, recommend, policy, cart), guardrails (red-team + PII), embeddings, atlas projection, trace store CRUD, eval golden-set load, all-page compile, orchestrator's action-intent regex, affiliate URL builder, acceptance-rate funnel, spoiler policy.

Layer 3: Game-prototype deep smoke (backend)

streamlit_app/scripts/smoke_game_prototypes.py — 19 steps:

[PASS] clean prior test state
[PASS] atlas page renders without error and seeds 3 starter tiles
[PASS] atlas: clicking Loved on a starter tile unlocks first_loved trophy
[PASS] atlas: + Deck button adds card and persists
[PASS] atlas: scout_suggest changes after revealing the suggested tile
[PASS] pokedex page renders with state shared from atlas
[PASS] pokedex: filling deck to 3 cards unlocks deck_built trophy
[PASS] pokedex: filtering 'Loved' shows only the loved card
[PASS] recommend_from_deck: empty deck -> empty recs
[PASS] recommend_from_deck: nonexistent sku ids -> empty recs (no crash)
[PASS] DECK_MAX cap is enforced
[PASS] invalid status raises ValueError
[PASS] reset_user wipes everything
[PASS] atlas page renders cleanly with EMPTY state (post-reset)
[PASS] atlas page renders with FULL state (every SKU touched)
[PASS] pokedex renders with FULL deck (5 cards)
[PASS] trophies are idempotent — unlocking twice doesn't duplicate
[PASS] safe_page_link: works without ScriptRunContext
[PASS] final cleanup

These tests exercise the Streamlit pages in headless AppTest mode and verify deep behaviors: trophy logic, deck cap enforcement, idempotency, edge-case state (empty / full atlas), invalid input handling, full reset flows.
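
The behaviors listed above (deck cap, idempotent trophies, rejection of invalid statuses, full reset) can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the project's code: `GameState` and its method names are hypothetical, `DECK_MAX = 5` is inferred from the "FULL deck (5 cards)" step, the `"seen"` status is invented for the example, and returning `False` when the deck is full is one possible way to enforce the cap.

```python
DECK_MAX = 5        # assumption: inferred from the "FULL deck (5 cards)" step
DECK_TROPHY_AT = 3  # deck_built unlocks at 3 cards, per the report
VALID_STATUSES = {"loved", "seen"}  # "seen" is a hypothetical second status


class GameState:
    """Hypothetical stand-in for the shared Atlas/Pokedex state."""

    def __init__(self):
        self.deck = []
        self.trophies = []
        self.statuses = {}

    def unlock(self, trophy):
        """Idempotent: a second unlock of the same trophy is a no-op."""
        if trophy not in self.trophies:
            self.trophies.append(trophy)

    def add_to_deck(self, sku_id):
        """Return True if added, False if duplicate or the deck is full."""
        if sku_id in self.deck or len(self.deck) >= DECK_MAX:
            return False
        self.deck.append(sku_id)
        if len(self.deck) >= DECK_TROPHY_AT:
            self.unlock("deck_built")
        return True

    def set_status(self, sku_id, status):
        """Reject unknown statuses instead of storing them."""
        if status not in VALID_STATUSES:
            raise ValueError(f"invalid status: {status!r}")
        self.statuses[sku_id] = status
        if status == "loved":
            self.unlock("first_loved")

    def reset(self):
        """Wipe deck, trophies, and statuses."""
        self.deck.clear()
        self.trophies.clear()
        self.statuses.clear()


state = GameState()
added = [state.add_to_deck(f"sku-{i}") for i in range(7)]
print(added)           # [True, True, True, True, True, False, False]
print(state.trophies)  # ['deck_built']
```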

Layer 4: API tests (FastAPI ACP server)

streamlit_app/scripts/api_tests_acp.py — 21 steps against the live ACP server on http://localhost:8000:

[PASS] healthz returns 200 + catalog size > 0
[PASS] discovery manifest has required shape
[PASS] GET /v1/products returns 12 by default
[PASS] GET /v1/products?q=demon filters by query
[PASS] GET /v1/products?completed_only=true respects flag
[PASS] GET /v1/products/{sku_id}: 200 for known
[PASS] GET /v1/products/{sku_id}: 404 for unknown
[PASS] POST /v1/cart creates a cart with computed subtotal
[PASS] POST /v1/cart with multiple items computes subtotal correctly
[PASS] POST /v1/cart with unknown sku returns 404
[PASS] GET /v1/cart/{cart_id} returns persisted cart
[PASS] GET /v1/cart/{unknown} returns 404
[PASS] POST /v1/checkout requires explicit confirm=true
[PASS] POST /v1/checkout with confirm=true succeeds and redacts email
[PASS] POST /v1/checkout with unknown cart returns 404
[PASS] GET /v1/checkout/{id} returns the checkout record
[PASS] GET /v1/checkout/{unknown} returns 404
[PASS] error responses are JSON, not HTML
[PASS] POST /v1/cart with empty items list still succeeds with subtotal=0
[PASS] POST /v1/cart rejects qty < 1 (Pydantic validation)
[PASS] POST /v1/cart rejects qty > 99 (Pydantic validation)

Notably verified:

- The ACP/UCP discovery manifest at /.well-known/agent-commerce.json advertises both protocols (acp/0.1, ucp/0.1).
- The voice-action safety rule from roadmap §6.4 (confirm: true required for checkout) is enforced: confirm: false returns 400.
- PII is redacted in checkout (alice@example.com → a***@example.com).
- Pydantic validation rejects qty < 1 and qty > 99 with HTTP 422.
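
The safety rules verified above can be sketched in plain Python, standing in for the FastAPI/Pydantic handlers. Everything here is an illustrative assumption rather than the server's code: the function names, the response bodies, and the order in which the 404 and 400 checks run are all hypothetical. Only the status codes and the redaction format come from the report.

```python
def redact_email(email):
    """Keep the first character of the local part and the full domain."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"  # not a usable address; reveal nothing
    return f"{local[0]}***@{domain}"


def validate_qty(qty):
    """Return an HTTP-style status: 422 outside 1..99, otherwise 200."""
    return 200 if 1 <= qty <= 99 else 422


def checkout(cart_exists, email, confirm):
    """Return (status, body) for a checkout attempt."""
    if not cart_exists:
        return 404, {"error": "cart not found"}
    if confirm is not True:
        # Voice-action safety rule: an explicit confirm=true is required.
        return 400, {"error": "confirm=true is required"}
    return 200, {"email": redact_email(email)}


print(redact_email("alice@example.com"))                      # a***@example.com
print(checkout(True, "alice@example.com", confirm=False)[0])  # 400
print(validate_qty(0), validate_qty(1), validate_qty(99), validate_qty(100))
# 422 200 200 422
```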

Start commands (run the server in one terminal, then the tests in another):

python -m uvicorn streamlit_app.surfaces.acp_server:app --port 8000
python streamlit_app/scripts/api_tests_acp.py
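
Because the API tests need a live server, a readiness poll can be useful when both commands are scripted together. The sketch below is one way to do that with the standard library. The `/healthz` path is an assumption based on the name of the first test step, and `http_probe` and `wait_until_ready` are hypothetical helpers, not part of the project.

```python
import time
import urllib.request


def http_probe(url="http://localhost:8000/healthz"):
    """Return True if the URL answers with HTTP 200 (path is assumed)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, and connection refusals
        return False


def wait_until_ready(probe=http_probe, timeout=30.0, interval=0.5):
    """Poll probe() until it succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A wrapper script could call `wait_until_ready()` after launching uvicorn and only start `api_tests_acp.py` once it returns `True`. The probe is passed in as a parameter so the polling logic can be exercised without a running server.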

What was NOT tested (and why)

Area	Why skipped	Suggested next step
Real R3F mesh raycasting (clicking a 3D card-disc with mouse coords)	Camera-relative; hard to test deterministically without exposing test hooks	Add `data-testid` attributes on critical 3D meshes via R3F's HTML hover helpers
Touch / mobile gestures	The 3D app uses mouse-drag; no touch path implemented yet	Add a virtual-joystick component, then test with a Playwright project that uses `devices['iPhone 14']`
Audio (footstep, NPC voice)	Not implemented	Once added, assert on Web Audio state via `page.evaluate` or use media-stream interception
Streamlit page-level Playwright	Streamlit's WebSocket-driven UI is awkward for Playwright; covered by `AppTest` instead	If needed, run the app with `--server.headless true` and drive it with Playwright helpers
Performance / FPS in 3D	Covered loosely by visual regression timing; no formal FPS test	Add a `requestAnimationFrame` polling test or capture a Chromium trace (Playwright's `browser.startTracing`)
Load/stress testing on FastAPI	One-shot API tests only	`locust` or `k6` against the ACP server
WebGL on SwiftShader vs real GPU	Visual snapshots have 8% tolerance	Run a `--project firefox` or `--project webkit` matrix to catch driver-specific renders

How to re-run everything

# Terminal 1 — keep the 3D preview server running
cd mangaassist_3d && npm run preview -- --port 4173

# Terminal 2 — keep the FastAPI ACP server running
python -m uvicorn streamlit_app.surfaces.acp_server:app --port 8000

# Terminal 3 — run all 4 layers
(cd mangaassist_3d && npx playwright test)                  # Layer 1: 56 tests (subshell keeps the repo root as cwd)
python streamlit_app/scripts/smoke_test.py                  # Layer 2: 14 tests
python streamlit_app/scripts/smoke_game_prototypes.py       # Layer 3: 19 tests
python streamlit_app/scripts/api_tests_acp.py               # Layer 4: 21 tests

Total runtime end-to-end: ~3.5 minutes.

Interpretation

The 3D app, the Streamlit pages, the shared game state, and the ACP server are all production-equivalent at the smoke-test level: every entry point is hit at least once, the error paths listed above are verified, each persistence channel is round-tripped, and each safety rule (voice-action confirm, PII redaction, deck cap, qty validation) is enforced.

The two real bugs caught in this session — the auto-unlocking First Step trophy and the stale-closure V-key toggle — were both state-management edge cases that wouldn't show up under casual play but would surface within minutes once a user explored. They're worth catching now.

What this report does not establish: visual fidelity (screenshots have a generous 8% tolerance), performance under real GPU pressure, behavior on browsers other than Chromium, or multi-session concurrency for the ACP server. Those are the next layer of test investment.