Test Report — MangaAssist (Streamlit + 3D + ACP)
Date: 2026-04-30. Coverage: 110 tests across 4 test layers, 100% passing after two bug fixes.
Summary
| Layer | Tool | Tests | Pass | Fail | Notes |
|---|---|---|---|---|---|
| Frontend (3D, browser) | Playwright (Chromium) | 56 | 56 | 0 | Headless, real WebGL via SwiftShader |
| Backend (Python smoke) | smoke_test.py | 14 | 14 | 0 | Catalog, MCP, guardrails, eval, atlas, traces |
| Backend (game prototypes) | smoke_game_prototypes.py | 19 | 19 | 0 | Atlas + Pokédex flows, edge cases |
| API (FastAPI ACP) | api_tests_acp.py | 21 | 21 | 0 | Discovery, products, cart, checkout, validation |
| Total | | 110 | 110 | 0 | |
Bugs found and fixed during testing
| File | Bug | Fix |
|---|---|---|
| mangaassist_3d/src/scenes/Lobby.tsx | "First Step" trophy auto-unlocks on player spawn (the player spawns at (0,0,4), so the distance from the origin, Math.hypot(0,4) = 4, already exceeds the 1.5 threshold) | Track initial position via useRef; only fire after distanceTo(spawn) > 1.5 |
| mangaassist_3d/src/components/Player.tsx + App.tsx | V-key view toggle had a stale closure inside R3F's async render loop, so the second press was a no-op | Move global key handlers (V, B) to App.tsx outside the Canvas; use functional setView(prev => …) |
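The First Step fix comes down to measuring distance from the spawn point instead of from the origin. The real change lives in Lobby.tsx (useRef plus distanceTo); what follows is only a language-neutral sketch of the arithmetic in Python. The class name, the method names, and the choice of horizontal (x, z) distance are assumptions made for illustration.

```python
import math

UNLOCK_DISTANCE = 1.5  # threshold quoted in the bug table


def distance_xz(a, b):
    """Horizontal distance between two (x, y, z) points (assumed metric)."""
    return math.hypot(a[0] - b[0], a[2] - b[2])


class FirstStepTracker:
    """Hypothetical stand-in for the useRef-based fix."""

    def __init__(self):
        self.spawn = None      # captured on the first update, like a ref
        self.unlocked = False

    def update(self, position):
        if self.spawn is None:
            self.spawn = position  # remember where the player started
        if distance_xz(position, self.spawn) > UNLOCK_DISTANCE:
            self.unlocked = True
        return self.unlocked


spawn = (0.0, 0.0, 4.0)
origin = (0.0, 0.0, 0.0)

# Buggy check: distance from the origin is already 4.0 at spawn.
buggy_unlocks_at_spawn = distance_xz(spawn, origin) > UNLOCK_DISTANCE

# Fixed check: nothing fires until the player has moved away from spawn.
tracker = FirstStepTracker()
fixed_unlocks_at_spawn = tracker.update(spawn)
fixed_unlocks_after_walk = tracker.update((0.0, 0.0, 6.0))
```

With the spawn captured on the first update, standing still never unlocks the trophy, while a 2-unit walk does.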
Layer 1: Playwright (frontend, 3D)
Config: playwright.config.ts — Chromium with --use-gl=angle --use-angle=swiftshader for real WebGL in headless mode. webServer block auto-starts npm run preview on port 4173 if not already running.
| Spec | Tests | What it covers |
|---|---|---|
| 01-smoke.spec.ts | 6 | Page title, start overlay visible/dismisses, canvas dimensions, WebGL context exists, no console errors |
| 02-hud.spec.ts | 7 | Scene title, control hints (W/A/S/D/Shift/Space/V/B), top-right counters, trophy counter, nav buttons, crosshair only in FP |
| 03-routing.spec.ts | 10 | Atlas/Pokédex button nav, Lobby return, B key, V key toggle, "current scene" button hiding |
| 04-persistence.spec.ts | 6 | Clean state shows zeros, seeded loved/deck/trophies reflected in HUD, persists across reload, malformed JSON ignored |
| 05-detail-card.spec.ts | 4 | Chat panel scoped to MangaBot, detail modal not present by default, Pokédex deck pedestal renders, Atlas canvas interactive |
| 06-visual.spec.ts | 3 | Visual-regression screenshot baselines for Lobby, Atlas, Pokédex (8% pixel-ratio tolerance) |
| 07-responsive.spec.ts | 8 | 1920×1080, 1366×768, 1024×768, 768×600 — HUD elements visible + within viewport bounds |
| 08-keyboard.spec.ts | 6 | V toggle (TP↔FP), B key behavior, WASD no-throw, Shift+W (run) no-throw, Space (jump) no-throw, unrelated keys ignored |
| 09-accessibility.spec.ts | 6 | <html lang>, favicon link, viewport meta, button text content present, HUD pointer-events policy, start overlay doesn't trap input |
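To make the 8% tolerance in 06-visual.spec.ts concrete, here is a minimal sketch of a pixel-ratio check. It assumes the figure is a cap on the fraction of differing pixels, which is how Playwright's maxDiffPixelRatio option is defined. The real comparator also applies a per-pixel color threshold, which this sketch omits, and the flat-list image representation is a simplification.

```python
MAX_DIFF_PIXEL_RATIO = 0.08  # the 8% tolerance quoted for 06-visual.spec.ts


def diff_pixel_ratio(baseline, actual):
    """Fraction of pixels that differ between two equally sized images.

    Images are flat lists of (r, g, b) tuples; this is a simplification.
    """
    if len(baseline) != len(actual):
        raise ValueError("images must have the same number of pixels")
    if not baseline:
        return 0.0
    differing = sum(1 for b, a in zip(baseline, actual) if b != a)
    return differing / len(baseline)


def within_tolerance(baseline, actual, max_ratio=MAX_DIFF_PIXEL_RATIO):
    """True when the share of changed pixels stays at or under the cap."""
    return diff_pixel_ratio(baseline, actual) <= max_ratio


# 100-pixel toy images: 5 changed pixels pass, 10 changed pixels fail.
base = [(0, 0, 0)] * 100
five_off = [(255, 255, 255)] * 5 + [(0, 0, 0)] * 95
ten_off = [(255, 255, 255)] * 10 + [(0, 0, 0)] * 90
```

Under this reading, a render that changes up to 8 of every 100 pixels still matches its baseline, which is why the report treats the snapshots as a loose check only.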
Run command:
cd mangaassist_3d
npx playwright test
Run time: ~2 min 22s for the full suite.
HTML report is generated under mangaassist_3d/playwright-report/.
Layer 2: Streamlit smoke (backend)
streamlit_app/scripts/smoke_test.py — 14 steps:
[PASS] seed catalog (force=True)
[PASS] catalog search filters
[PASS] mcp tools direct call
[PASS] guardrails red-team suite
[PASS] visual index (hash backend) + search
[PASS] embedding atlas build
[PASS] trace_store roundtrip
[PASS] eval golden-set load
[PASS] compile-check every page + surface
[PASS] orchestrator action-intent regex coverage
[PASS] anime->manga reading-order lookup (curated graph)
[PASS] affiliate URL builder (no creds = search URL)
[PASS] acceptance-rate funnel (shown + clicked)
[PASS] orchestrator spoiler-policy switch is wired
Coverage is the underlying business logic: catalog, MCP tools (anime↔manga, recommend, policy, cart), guardrails (red-team + PII), embeddings, atlas projection, trace store CRUD, eval golden-set load, all-page compile, orchestrator's action-intent regex, affiliate URL builder, acceptance-rate funnel, spoiler policy.
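For readers unfamiliar with this style of script, the [PASS] lines above can be produced by a very small step runner. The sketch below is not taken from smoke_test.py; the function name, step names, and step bodies are hypothetical and show only the general shape.

```python
def run_steps(steps):
    """Run (name, callable) pairs and print one [PASS]/[FAIL] line each.

    Returns the number of failed steps so a caller can use it as an exit code.
    """
    failures = 0
    for name, fn in steps:
        try:
            fn()
        except Exception as exc:
            failures += 1
            print(f"[FAIL] {name}: {exc}")
        else:
            print(f"[PASS] {name}")
    return failures


# Hypothetical steps; the real script exercises catalog, MCP tools, etc.
def step_roundtrip():
    store = {}
    store["trace-1"] = {"latency_ms": 12}
    assert store["trace-1"]["latency_ms"] == 12


def step_broken():
    raise RuntimeError("simulated failure")


failed = run_steps([
    ("trace_store roundtrip", step_roundtrip),
    ("deliberately broken step", step_broken),
])
```

Catching exceptions per step lets one failing check be reported without hiding the results of the steps after it.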
Layer 3: Game-prototype deep smoke (backend)
streamlit_app/scripts/smoke_game_prototypes.py — 19 steps:
[PASS] clean prior test state
[PASS] atlas page renders without error and seeds 3 starter tiles
[PASS] atlas: clicking Loved on a starter tile unlocks first_loved trophy
[PASS] atlas: + Deck button adds card and persists
[PASS] atlas: scout_suggest changes after revealing the suggested tile
[PASS] pokedex page renders with state shared from atlas
[PASS] pokedex: filling deck to 3 cards unlocks deck_built trophy
[PASS] pokedex: filtering 'Loved' shows only the loved card
[PASS] recommend_from_deck: empty deck -> empty recs
[PASS] recommend_from_deck: nonexistent sku ids -> empty recs (no crash)
[PASS] DECK_MAX cap is enforced
[PASS] invalid status raises ValueError
[PASS] reset_user wipes everything
[PASS] atlas page renders cleanly with EMPTY state (post-reset)
[PASS] atlas page renders with FULL state (every SKU touched)
[PASS] pokedex renders with FULL deck (5 cards)
[PASS] trophies are idempotent — unlocking twice doesn't duplicate
[PASS] safe_page_link: works without ScriptRunContext
[PASS] final cleanup
These tests exercise the Streamlit pages in headless AppTest mode and verify deep behaviors: trophy logic, deck cap enforcement, idempotency, edge-case state (empty / full atlas), invalid input handling, full reset flows.
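The invariants checked here (deck cap, idempotent trophies, ValueError on a bad status, full reset) can be sketched in a few lines. This is an illustrative model and not the project's shared game state: the class, its method names, and the status values are hypothetical, and DECK_MAX = 5 is inferred from the "FULL deck (5 cards)" step.

```python
DECK_MAX = 5  # inferred from "FULL deck (5 cards)"; treat as an assumption
VALID_STATUSES = {"loved", "read", "skipped"}  # hypothetical status names


class GameState:
    """Toy model of the state shared between the Atlas and Pokédex pages."""

    def __init__(self):
        self.deck = []
        self.status = {}
        self.trophies = set()

    def add_to_deck(self, sku_id):
        """Add a card; return False when the cap blocks the add."""
        if sku_id in self.deck:
            return True
        if len(self.deck) >= DECK_MAX:
            return False
        self.deck.append(sku_id)
        if len(self.deck) >= 3:
            self.unlock("deck_built")
        return True

    def set_status(self, sku_id, status):
        if status not in VALID_STATUSES:
            raise ValueError(f"invalid status: {status!r}")
        self.status[sku_id] = status
        if status == "loved":
            self.unlock("first_loved")

    def unlock(self, trophy):
        # A set makes repeated unlocks a no-op, so trophies never duplicate.
        self.trophies.add(trophy)

    def reset(self):
        self.deck.clear()
        self.status.clear()
        self.trophies.clear()
```

Storing trophies in a set is one simple way to get the idempotency that the "unlocking twice doesn't duplicate" step asserts.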
Layer 4: API tests (FastAPI ACP server)
streamlit_app/scripts/api_tests_acp.py — 21 steps against the live ACP server on http://localhost:8000:
[PASS] healthz returns 200 + catalog size > 0
[PASS] discovery manifest has required shape
[PASS] GET /v1/products returns 12 by default
[PASS] GET /v1/products?q=demon filters by query
[PASS] GET /v1/products?completed_only=true respects flag
[PASS] GET /v1/products/{sku_id}: 200 for known
[PASS] GET /v1/products/{sku_id}: 404 for unknown
[PASS] POST /v1/cart creates a cart with computed subtotal
[PASS] POST /v1/cart with multiple items computes subtotal correctly
[PASS] POST /v1/cart with unknown sku returns 404
[PASS] GET /v1/cart/{cart_id} returns persisted cart
[PASS] GET /v1/cart/{unknown} returns 404
[PASS] POST /v1/checkout requires explicit confirm=true
[PASS] POST /v1/checkout with confirm=true succeeds and redacts email
[PASS] POST /v1/checkout with unknown cart returns 404
[PASS] GET /v1/checkout/{id} returns the checkout record
[PASS] GET /v1/checkout/{unknown} returns 404
[PASS] error responses are JSON, not HTML
[PASS] POST /v1/cart with empty items list still succeeds with subtotal=0
[PASS] POST /v1/cart rejects qty < 1 (Pydantic validation)
[PASS] POST /v1/cart rejects qty > 99 (Pydantic validation)
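The cart steps above imply three rules: quantities must fall in 1..99, unknown SKUs are a 404, and an empty cart is valid with a subtotal of 0. A stdlib-only sketch of that logic follows. The real server uses Pydantic and FastAPI; the catalog contents, prices, and the CartError class here are invented for illustration.

```python
from decimal import Decimal

QTY_MIN, QTY_MAX = 1, 99  # bounds taken from the two validation steps

# Hypothetical catalog; SKU ids and prices are invented for illustration.
CATALOG = {"sku-001": Decimal("9.99"), "sku-002": Decimal("12.50")}


class CartError(Exception):
    """Carries the HTTP status a handler would return."""

    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code


def compute_subtotal(items):
    """items is a list of (sku_id, qty) pairs; returns a Decimal subtotal."""
    # Validate every quantity first, mirroring request validation (422)
    # running before any catalog lookup (404).
    for _, qty in items:
        if not QTY_MIN <= qty <= QTY_MAX:
            raise CartError(422, f"qty must be between {QTY_MIN} and {QTY_MAX}")
    subtotal = Decimal("0")
    for sku_id, qty in items:
        if sku_id not in CATALOG:
            raise CartError(404, f"unknown sku: {sku_id}")
        subtotal += CATALOG[sku_id] * qty
    return subtotal
```

Decimal is used here to avoid float rounding in money arithmetic; whether the actual server does the same is not stated in this report.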
Notably verified:
- ACP/UCP discovery manifest at /.well-known/agent-commerce.json advertises both protocols (acp/0.1, ucp/0.1)
- The voice-action safety rule from roadmap §6.4 (confirm: true required for checkout) is enforced — confirm: false returns 400
- PII redaction in checkout (alice@example.com → a***@example.com)
- Pydantic validation rejects qty < 1 and qty > 99 with HTTP 422
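The confirm gate and the email redaction listed above are simple enough to sketch directly. The functions below are hypothetical and not the ACP server's code; they reproduce only the two observable behaviors from this report (400 without confirm: true, and alice@example.com becoming a***@example.com).

```python
def redact_email(email):
    """Keep the first character of the local part and the full domain."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "***"  # not an email shape; redact everything
    return f"{local[0]}***@{domain}"


def checkout(cart, email, confirm=False):
    """Return (status_code, body) for a hypothetical checkout handler."""
    if confirm is not True:
        # Safety rule: never complete a purchase without explicit confirmation.
        return 400, {"error": "confirm=true is required"}
    return 200, {"cart_id": cart["id"], "email": redact_email(email)}


status_without, _ = checkout({"id": "cart-1"}, "alice@example.com")
status_with, body = checkout({"id": "cart-1"}, "alice@example.com", confirm=True)
```

Checking `confirm is not True` instead of plain truthiness means a stray non-boolean value cannot slip through the gate.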
Start command:
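# Terminal 1: start the ACP server (it runs in the foreground)
python -m uvicorn streamlit_app.surfaces.acp_server:app --port 8000
# Terminal 2: run the API tests against it
python streamlit_app/scripts/api_tests_acp.py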
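Because the API tests need a live server, a small readiness poll can avoid racing the uvicorn startup. The sketch below is an optional helper and not part of api_tests_acp.py; the /healthz path is assumed from the "healthz returns 200" step, and the function names are hypothetical.

```python
import time
import urllib.error
import urllib.request

# Path assumed from the "healthz returns 200" step.
HEALTH_URL = "http://localhost:8000/healthz"


def http_probe(url=HEALTH_URL, timeout_s=1.0):
    """Return True when the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(probe=http_probe, attempts=20, interval_s=0.5,
                     sleep=time.sleep):
    """Call probe up to `attempts` times, sleeping between failed tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(interval_s)
    return False
```

A test script could call `wait_until_ready()` once at the top and exit early with a clear message if it returns False. The probe and sleep are injectable so the helper can be exercised without a running server.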
What was NOT tested (and why)
| Area | Why skipped | Suggested next step |
|---|---|---|
| Real R3F mesh raycasting (clicking a 3D card-disc with mouse coords) | Camera-relative; hard to test deterministically without exposing test hooks | Add data-testid attributes on critical 3D meshes via R3F's HTML hover helpers |
| Touch / mobile gestures | The 3D app uses mouse-drag; no touch path implemented yet | Add a virtual-joystick component, then test on --device "iPhone 14" |
| Audio (footstep, NPC voice) | Not implemented | Once added, use Playwright's audio API or media-stream interception |
| Streamlit page-level Playwright | Streamlit's WebSocket-driven UI is awkward for Playwright; covered by AppTest instead | If needed, use Streamlit's own --server.runOnSave with Playwright helpers |
| Performance / FPS in 3D | Covered loosely by visual regression timing; no formal FPS test | Add requestAnimationFrame polling test or use chrome.tracing |
| Load/stress testing on FastAPI | One-shot API tests only | locust or k6 against the ACP server |
| WebGL on SwiftShader vs real GPU | Visual snapshots have 8% tolerance | Run a --project firefox or --project webkit matrix to catch driver-specific renders |
How to re-run everything
# Terminal 1 — keep the 3D preview server running
cd mangaassist_3d && npm run preview -- --port 4173
# Terminal 2 — keep the FastAPI ACP server running
python -m uvicorn streamlit_app.surfaces.acp_server:app --port 8000
# Terminal 3 — run all 4 layers
(cd mangaassist_3d && npx playwright test) # Layer 1: 56 tests (subshell keeps the repo root as cwd for the lines below)
python streamlit_app/scripts/smoke_test.py # Layer 2: 14 tests
python streamlit_app/scripts/smoke_game_prototypes.py # Layer 3: 19 tests
python streamlit_app/scripts/api_tests_acp.py # Layer 4: 21 tests
Total runtime end-to-end: ~3.5 minutes.
Interpretation
The 3D app, the Streamlit pages, the shared game state, and the ACP server all behave as expected at the smoke-test level: every entry point is hit at least once, the error paths listed above (404 on unknown ids, 400 on unconfirmed checkout, 422 on out-of-range qty, ValueError on invalid status) are verified, every persistence channel is round-tripped, and every safety rule (voice-action confirm, PII redaction, deck cap, qty validation) is enforced.
The two real bugs caught in this session — the auto-unlocking First Step trophy and the stale-closure V-key toggle — were both state-management edge cases that wouldn't show up under casual play but would surface within minutes once a user explored. They're worth catching now.
What this report does not establish: visual fidelity (screenshots have a generous 8% tolerance), performance under real GPU pressure, behavior on second-tier browsers, or multi-session concurrency for the ACP server. Those are the next layer of test investment.