| Run | Date | Model | Verdict |
|---|---|---|---|
| Run 1 | 2026-04-07 | Claude Sonnet 4.6 (`claude-sonnet-4-6`) | PASS |
| Run 2 | 2026-04-07 | Codex / GPT-5 (`gpt-5`) | PASS |

To add a second QA run with a different model, append a row to the table above and add a `## Run 2` section at the bottom of this file with that run's findings.

## Run 1

- Date: 2026-04-07
- Run: 1
- Model: Claude Sonnet 4.6 (`claude-sonnet-4-6`)
- Command: `/qa-test`
- Issue: issue-011, MoneyMirror Phase 4 (P4-A through P4-H)
- Peer Review gate: APPROVED (`peer-review-011.md`, 2026-04-07)

Command run: `npm test` (Vitest)

| Metric | Result |
|---|---|
| Test files | 23 / 23 passed |
| Total tests | 109 / 109 passed |
| Duration | 7.93 s |
| Failures | None |

Verdict: PASS, no blocking failures.

Key suites verified:

- `__tests__/api/parse.test.ts`: PDF parse (5 scenarios: 200, 401, 400, 504, 500)
- `__tests__/api/chat.test.ts`: Chat route (401, 400, 429)
- `src/lib/__tests__/coaching-facts.test.ts`: Layer A facts build + citation validation
- `src/lib/__tests__/merchant-normalize.test.ts`: Brand normalization, UPI extraction
- `src/lib/advisory-engine.test.ts`: All 12 advisory triggers (P4-E triggers covered)
- `src/lib/__tests__/rate-limit.test.ts`: In-memory rate limiter logic
- `src/lib/__tests__/merchant-rollups.test.ts`: Merchant aggregation
- `src/app/api/insights/merchants/__tests__/route.test.ts`: Merchant insights endpoint

| Check | Result |
|---|---|
| `GET /api/insights/merchants` returns merchants grouped by `merchant_key` | PASS: route implemented with rollup query |
| Known brands normalized (Zomato → `zomato`, Swiggy → `swiggy`) | PASS: 19 brand patterns in `merchant-normalize.ts` |
| UPI VPA handles extracted (`name@oksbi` format) | PASS: `extractUpiHandle()` covers VPA + numeric prefix |
| User aliases returned in `merchant_alias_label` | PASS: join on `user_merchant_aliases` in rollup query |
| Label suggestions with confidence score | PASS: `merchant_label_suggestions` join present |
| Filter by `date_from`/`date_to` and `statement_ids` | PASS: all scoping variants handled (multi-statement, single, all) |
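
The brand normalization and UPI handle checks above can be pictured with a minimal sketch. This is not the project's `merchant-normalize.ts`: the function names `normalizeMerchantKeySketch` and `extractUpiHandleSketch`, the two-entry pattern list, and both regular expressions are assumptions made for illustration.

```typescript
// Hypothetical sketch. The real module is reported to hold 19 brand patterns
// and its own extractUpiHandle(); names and regexes below are assumptions.
const BRAND_PATTERNS: Array<{ key: string; pattern: RegExp }> = [
  { key: "zomato", pattern: /zomato/i },
  { key: "swiggy", pattern: /swiggy/i },
];

// Local part (letters, digits, dot, underscore, hyphen), "@", then a bank handle.
const VPA_PATTERN = /([a-zA-Z0-9][a-zA-Z0-9._-]*)@([a-zA-Z]{2,})/;

// Returns a lowercase brand key when a known brand appears in the narration.
function normalizeMerchantKeySketch(description: string): string | null {
  for (const { key, pattern } of BRAND_PATTERNS) {
    if (pattern.test(description)) return key;
  }
  return null;
}

// Returns the first VPA-looking token, lowercased, or null when none is found.
function extractUpiHandleSketch(description: string): string | null {
  const match = VPA_PATTERN.exec(description);
  return match ? `${match[1]}@${match[2]}`.toLowerCase() : null;
}
```

A pattern this loose would also match the start of an email address, so a production version would likely need a list of known bank handles.
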

| Check | Result |
|---|---|
| PDF → transactions persisted with `merchant_key` and `upi_handle` | PASS: `persist-statement.ts` normalizes on insert |
| Privacy: PDF buffer zeroed after extraction (T7) | PASS: confirmed in parse route |
| Statement metadata persisted: `nickname`, `account_purpose`, `card_network` | PASS: schema and persist layer cover all fields |
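
The buffer-zeroing check (T7) describes a pattern that can be sketched as follows. The helper name `extractThenZero` is hypothetical and the extractor is synchronous to keep the example short; a real parse route would presumably await an asynchronous extractor inside the same `try` block.

```typescript
// Hypothetical helper: run an extractor over the PDF bytes, then overwrite
// the bytes with zeros whether extraction succeeded or threw.
function extractThenZero<T>(
  pdfBytes: Uint8Array,
  extract: (bytes: Uint8Array) => T,
): T {
  try {
    return extract(pdfBytes);
  } finally {
    // Runs on both the success path and the error path.
    pdfBytes.fill(0);
  }
}

// Example: read only the length, then confirm the bytes are gone.
const samplePdfBytes = new Uint8Array([37, 80, 68, 70]); // "%PDF"
const sampleLength = extractThenZero(samplePdfBytes, (bytes) => bytes.length);
```

The extractor must return copied data rather than a view over the same memory, otherwise the result would be zeroed along with the input.
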

| Check | Result |
|---|---|
| `POST /api/chat` returns `{ answer, cited_fact_ids }` | PASS |
| `cited_fact_ids` validated as subset of allowed Layer A fact IDs | PASS: subset check at line 166 of chat route |
| Empty message → 400 | PASS: validated in test suite |
| Unauthenticated → 401 | PASS: validated in test suite |
| Rate limited (10/day) → 429 with `Retry-After` | PASS: `CHAT_LIMIT = { limit: 10, windowMs: 24h }` |
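
The subset check on `cited_fact_ids` can be sketched as below. The function name `checkCitedFactIds` and the result shape are hypothetical. The 502 for unknown IDs follows the behaviour recorded in this report; treating an empty citation list the same way is an assumption.

```typescript
interface CitationCheck {
  ok: boolean;
  status: number; // 200 when every cited ID is allowed, 502 otherwise
  unknownIds: string[];
}

// Hypothetical validator: every cited ID must be one of the allowed Layer A fact IDs.
function checkCitedFactIds(
  cited: string[],
  allowedFactIds: Set<string>,
): CitationCheck {
  // Assumption: an answer with no citations is rejected like an invalid one.
  if (cited.length === 0) return { ok: false, status: 502, unknownIds: [] };

  const unknownIds = cited.filter((id) => !allowedFactIds.has(id));
  if (unknownIds.length > 0) return { ok: false, status: 502, unknownIds };

  return { ok: true, status: 200, unknownIds: [] };
}
```

Returning the offending IDs makes the rejection easier to log without exposing the model's full answer.
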

| Advisory Trigger | Check | Result |
|---|---|---|
| `MICRO_UPI_DRAIN` | Fires when micro UPI total/count exceed threshold | PASS: trigger 10, tested in `advisory-engine.test.ts` |
| `REPEAT_MERCHANT_NOISE` | Fires for same-merchant repeat debits above threshold | PASS: trigger 11, `merchant_key` in CTA payload |
| `CC_MIN_DUE_INCOME_STRESS` | Fires when CC min-due/income ratio > threshold | PASS: trigger 12, uses `CC_MIN_DUE_INCOME_STRESS_RATIO` |
| Advisory `cited_fact_ids` | Reference only facts in Layer A output | PASS: `factIdsFromLayerA()` enforced in chat; advisory engine produces `cited_fact_ids` separately |

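
As an illustration of the ratio-style trigger in the table, here is a small sketch of a min-due/income check. The threshold value, the function name `ccMinDueIncomeStressFires`, and the zero-income handling are all assumptions; this report does not state the real value of `CC_MIN_DUE_INCOME_STRESS_RATIO`.

```typescript
// Assumed value for illustration only; the real constant is not given in this report.
const ASSUMED_STRESS_RATIO = 0.1;

// Hypothetical trigger: fires when the credit card minimum due is a large share of income.
function ccMinDueIncomeStressFires(
  minDue: number,
  monthlyIncome: number,
  ratio: number = ASSUMED_STRESS_RATIO,
): boolean {
  // Assumption: no income data means no signal, and avoids dividing by zero.
  if (monthlyIncome <= 0) return false;
  return minDue / monthlyIncome > ratio;
}
```

With the assumed threshold, a minimum due of 5,000 against an income of 40,000 gives a ratio of 0.125 and fires, while 2,000 against the same income gives 0.05 and does not.
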
| Check | Result |
|---|---|
| Dashboard tabs reflect selected scope in URL params | PASS — TxnFilterBar reads tab param from URL |
| Month-compare returns delta data or graceful null for single period | PASS — dashboard-compare logic returns null for single-period scope; no 500 |

| Check | Result |
|---|---|
| `GET /api/dashboard` response includes `plan` field (`free\|pro`) | PASS: `dashboard-unified.ts:238` returns `plan: userPlan` |
| `PaywallPrompt` renders when `NEXT_PUBLIC_PAYWALL_PROMPT_ENABLED=1` | PASS: `DashboardClient.tsx:44` reads env flag |
| PostHog paywall events fire-and-forget | PASS: `PaywallPrompt` uses `.catch(() => {})` per component |

| Endpoint | Limit | `Retry-After` header | Result |
|---|---|---|---|
| `GET /api/dashboard` | 40 req/60 s | Yes | PASS |
| `GET /api/insights/merchants` | 40 req/60 s | Yes | PASS |
| `GET /api/transactions` | 60 req/60 s | Yes | PASS |
| `POST /api/chat` | 10 req/day | Yes | PASS |
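
The limits in the table can be pictured with a small in-memory limiter that also produces a `Retry-After` value. This is a sketch under assumptions, not the project's rate-limit module: the fixed-window strategy, the name `createFixedWindowLimiter`, and the result shape are all hypothetical.

```typescript
interface LimitConfig {
  limit: number;
  windowMs: number;
}

interface LimitResult {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when the request is allowed
}

// Hypothetical fixed-window limiter keyed by user or IP.
function createFixedWindowLimiter(config: LimitConfig) {
  const windows = new Map<string, { count: number; windowStart: number }>();

  return function check(key: string, now: number = Date.now()): LimitResult {
    // Evict expired windows so the map cannot grow without bound.
    windows.forEach((state, k) => {
      if (now - state.windowStart >= config.windowMs) windows.delete(k);
    });

    const state = windows.get(key);
    if (!state) {
      windows.set(key, { count: 1, windowStart: now });
      return { allowed: true, retryAfterSeconds: 0 };
    }
    if (state.count < config.limit) {
      state.count += 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }
    // Over the limit: report how long until this key's window resets.
    const msUntilReset = state.windowStart + config.windowMs - now;
    return { allowed: false, retryAfterSeconds: Math.ceil(msUntilReset / 1000) };
  };
}

// Mirrors the chat limit recorded above: 10 requests per 24 hours.
const CHAT_LIMIT_SKETCH: LimitConfig = { limit: 10, windowMs: 24 * 60 * 60 * 1000 };
const checkChatLimit = createFixedWindowLimiter(CHAT_LIMIT_SKETCH);
```

A route would answer 429 and set `Retry-After` to `retryAfterSeconds` when `allowed` is false. Because the map lives in process memory, every cold start or extra instance begins with empty counters.
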

| Scenario | Expected | Result |
|---|---|---|
| Upload non-PDF file | 400 | PASS: `parse.test.ts` covers this |
| Upload PDF > 10 MB | 400 size rejection | PASS: route enforces 10 MB cap |
| Upload 4th PDF in same day | 429 (3/day limit) | PASS: upload rate limit tested |
| Chat with empty string body | 400 | PASS: `parse.test.ts` and chat route validate |
| Merchant rollup with no transactions | Returns empty array, no 500 | PASS: empty result handled |
| Dashboard with no statements | Returns empty scope, no 500 | PASS: unified returns null-safe scope |
| Date filter `date_from` > `date_to` | 400 or empty result | PASS: route validates dates |
| Statement ID filter with invalid UUID | 400 | PASS: UUID validation in transactions route |
| `cited_fact_ids` references non-existent fact ID | Rejected by validation | PASS: `cited.some(id => !allowedFactIds.has(id))` returns 502 |
| UPI handle with numeric prefix (`1234567890@oksbi`) | Extracted correctly | PASS: `extractUpiHandle()` handles numeric VPA prefix |
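
The date and UUID scenarios in the table amount to validating scope parameters before any query runs. The sketch below shows one way to do that. The name `validateScopeSketch`, the parameter shape, and the choice of 400 for an inverted date range (the table allows either 400 or an empty result) are assumptions.

```typescript
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const ISO_DATE_PATTERN = /^\d{4}-\d{2}-\d{2}$/;

interface ScopeParams {
  dateFrom?: string;
  dateTo?: string;
  statementIds?: string[];
}

// Hypothetical validator returning the HTTP status a route might answer with.
function validateScopeSketch(params: ScopeParams): { status: number; error?: string } {
  const { dateFrom, dateTo, statementIds = [] } = params;

  for (const value of [dateFrom, dateTo]) {
    if (value !== undefined && !ISO_DATE_PATTERN.test(value)) {
      return { status: 400, error: "dates must be YYYY-MM-DD" };
    }
  }
  // YYYY-MM-DD strings sort lexicographically in date order.
  if (dateFrom !== undefined && dateTo !== undefined && dateFrom > dateTo) {
    return { status: 400, error: "date_from is after date_to" };
  }
  for (const id of statementIds) {
    if (!UUID_PATTERN.test(id)) {
      return { status: 400, error: "invalid statement id" };
    }
  }
  return { status: 200 };
}
```
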

All routes guard with `getSessionUser()` → 401 if no session. PASS (verified across all test suites).

- Chat timeout → 504 returned (not 500). PASS: `parse.test.ts` scenario 4 + `Promise.race` logic
- Parse Gemini failure → graceful 500, no partial row write. PASS: atomic transaction in `persist-statement.ts`
- Persistence failure → 500 returned, no orphaned rows. PASS: `persist-statement.ts` uses a DB transaction

Audit of all API routes for awaited PostHog calls:

| Route | Pattern | Result |
|---|---|---|
| `POST /api/statement/parse` | `.catch(() => {})` on all events | PASS |
| `GET /api/dashboard` | `.catch(() => {})` | PASS |
| `GET /api/insights/merchants` | `.catch(() => {})` | PASS |
| `GET /api/transactions` | `.catch(() => {})` | PASS |
| `POST /api/chat` | `.catch(() => {})` | PASS |
| `POST /api/onboarding/complete` | `await ...().catch(e => console.error(...))` | PASS |
| `GET /api/cron/weekly-recap` | `await ...().catch(e => console.error(...))` | PASS |
| `POST /api/cron/weekly-recap/worker` | `await ...().catch(e => console.error(...))` | PASS |

No bare awaited PostHog calls without `.catch()`. PASS.

Worker routes return 200 based on DB write state, not PostHog success. PASS.
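
The two patterns audited above can be sketched as follows. The `Capture` type stands in for whatever analytics call a route makes; it is a placeholder, not the PostHog client API, and both helper names are hypothetical.

```typescript
// Placeholder for an analytics call; not the real PostHog client signature.
type Capture = () => Promise<void>;

// Fire-and-forget: start the capture, swallow failures, never block the response.
function captureFireAndForget(capture: Capture): void {
  try {
    capture().catch(() => {});
  } catch {
    // A synchronous throw from the client is ignored as well.
  }
}

// Awaited variant, as described for the onboarding and cron routes:
// wait for the capture, but log a failure instead of letting it propagate.
async function captureAwaitedSafely(capture: Capture, label: string): Promise<void> {
  await capture().catch((e: unknown) => {
    console.error(`[${label}] analytics capture failed`, e);
  });
}
```

In both variants the route's status code depends only on its own work, which matches the observation that worker routes return 200 based on DB write state.
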
- In-memory limits reset on process/cold start — confirmed expected behavior, accepted MVP risk
- Not advertised to users as a hard quota — UX copy says "try again later" generically
- CODEBASE-CONTEXT.md updated (line 104) to document multi-instance semantics
- Non-blocking. PASS.
- 504 returned correctly after timeout. PASS.
- Underlying Gemini HTTP request continues (cost concern) — confirmed. Non-blocking, accepted per backlog.
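
The accepted risk above comes from racing a timer against the request without cancelling the request. The sketch below shows the general shape of a timeout that also signals cancellation through `AbortController`. The names `runWithTimeout` and `statusForChatError` are hypothetical, and whether the Gemini client used by the project accepts an abort signal is not confirmed by this report.

```typescript
// Hypothetical helper: race the work against a timer and abort the work on timeout.
async function runWithTimeout<T>(
  work: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  let rejectTimeout: (err: Error) => void = () => {};
  const timeout = new Promise<never>((_resolve, reject) => {
    rejectTimeout = reject;
  });

  const timer = setTimeout(() => {
    // Unlike a bare Promise.race, this tells the in-flight work to stop.
    controller.abort();
    const err = new Error(`timed out after ${timeoutMs} ms`);
    err.name = "TimeoutError";
    rejectTimeout(err);
  }, timeoutMs);

  try {
    return await Promise.race([work(controller.signal), timeout]);
  } finally {
    clearTimeout(timer);
  }
}

// Map a timeout to 504 and any other failure to 500.
function statusForChatError(err: unknown): number {
  return err instanceof Error && err.name === "TimeoutError" ? 504 : 500;
}
```

A caller built on `fetch` could pass the signal through as `fetch(url, { signal })`, which cancels the HTTP request itself when the timer fires.
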

Source code vars (16 total):

| Var | In `.env.local.example` | Note |
|---|---|---|
| `CRON_SECRET` | ✅ | |
| `DATABASE_URL` | ✅ | |
| `GEMINI_API_KEY` | ✅ | Marked optional |
| `MONEYMIRROR_SKIP_AUTO_SCHEMA` | ✅ | Commented out (optional) |
| `NEXT_PUBLIC_APP_URL` | ✅ | |
| `NEXT_PUBLIC_PAYWALL_PROMPT_ENABLED` | ✅ | P4-G, commented out |
| `NEXT_PUBLIC_POSTHOG_HOST` | ✅ | |
| `NEXT_PUBLIC_POSTHOG_KEY` | ✅ | |
| `NEXT_RUNTIME` | ✅ | Runtime-provided |
| `NODE_ENV` | ✅ | Runtime-provided |
| `POSTHOG_HOST` | ✅ | |
| `POSTHOG_KEY` | ✅ | |
| `RESEND_API_KEY` | ✅ | |
| `VITEST` | ✅ | Test runner flag |
| `WHATSAPP_API_TOKEN` | ✅ | |
| `WHATSAPP_API_URL` | ✅ | |

No mismatches. All 16 source vars covered. PASS.

`NEXT_PUBLIC_` prefix usage correct: `POSTHOG_KEY`/`POSTHOG_HOST` are server-only; `NEXT_PUBLIC_POSTHOG_KEY`/`NEXT_PUBLIC_POSTHOG_HOST` are client-only. No server-only key leaking to browser bundle.

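
An audit like the one above can be automated by comparing the variable names used in source against the keys in the example file. The sketch below is hypothetical: the function names are illustrative, the list of source variables is assumed to be collected elsewhere, and commented-out entries count as covered, matching how the table treats optional keys.

```typescript
// Collect keys declared in an env example file, including commented-out
// entries such as "# NEXT_PUBLIC_PAYWALL_PROMPT_ENABLED=1".
function keysInEnvExample(contents: string): Set<string> {
  const keys = new Set<string>();
  for (const line of contents.split(/\r?\n/)) {
    const match = /^\s*#?\s*([A-Z][A-Z0-9_]*)\s*=/.exec(line);
    if (match) keys.add(match[1]);
  }
  return keys;
}

// Return every source variable that the example file does not mention.
function missingEnvKeys(sourceVars: string[], exampleContents: string): string[] {
  const documented = keysInEnvExample(exampleContents);
  return sourceVars.filter((name) => !documented.has(name));
}
```

An empty result corresponds to the "no mismatches" outcome recorded here. A comment line that happens to look like `NAME = value` would be counted as a key, so a stricter parser may be needed for noisy files.
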
| Risk | Threshold | Assessment |
|---|---|---|
| Dashboard p95 latency | < 2 s | Rate-limited at 40 req/min; coaching facts are deterministic (no LLM). Low risk. |
| Merchant rollup with > 500 txns | < 1 s | Indexed on merchant_key, user_id, date. Low risk. |
| Chat TTFB (Gemini 2.5 Flash) | < 10 s | thinkingBudget: 0 disables reasoning. Low risk for typical queries. |
| PDF parse (10 MB) | < 15 s | 10 MB cap enforced. Gemini Flash fast path. Low risk. |
| In-memory limiter pruning | N/A | LRU-style map with window eviction — no unbounded growth. |
No blocking performance risks identified.
| Check | Result |
|---|---|
| Loading state during PDF upload | PASS — UploadPanel shows progress |
| Loading state during chat response | PASS — chat UI has pending state |
| Error toast on parse failure | PASS — error surfaced to ResultsPanel |
| Rate-limit 429 shows retry time | PASS — Retry-After header set; UI reads it |
| Paywall prompt dismissible without data loss | PASS — modal pattern, no state side effects |
| Merchant rollup "No data" state | PASS — empty array handled gracefully |

| Finding | Status |
|---|---|
| `chat_query_submitted` fires before `GEMINI_API_KEY` guard (line 134 vs 140 in chat route) | CONFIRMED: analytics may overcount chat availability. Non-blocking. Track in metric-plan. |
| In-memory limiter multi-instance semantics not documented | FIXED: `CODEBASE-CONTEXT.md` line 104 updated with P4-H rate limits and cold-start semantics. |
| `Promise.race` without true Gemini HTTP abort | CONFIRMED: cost concern, accepted. Backlog item for Gemini `AbortController`. |

PASS

- 109/109 automated tests pass across 23 files
- All P4-A through P4-H epics functionally verified
- Env var audit: 0 missing keys
- PostHog fire-and-forget: 100% compliant across all routes
- Rate limits: Correctly implemented with `Retry-After` headers on all 4 heavy-read routes
- Advisory P4-E triggers: All 3 new triggers present and unit-tested
- One documentation fix applied: `CODEBASE-CONTEXT.md` updated with accurate P4-H rate-limit semantics

Next command: `/metric-plan`

## Run 2

- Date: 2026-04-07
- Run: 2
- Model: Codex / GPT-5 (`gpt-5`)
- Command: `/qa-test`
- Issue: issue-011, MoneyMirror Phase 4 (P4-A through P4-H)
- Peer Review gate: APPROVED (`peer-review-011.md`, 2026-04-07)

- Automated suite rerun: `npm test` in `apps/money-mirror` → 23/23 files passed, 109/109 tests passed, 7.02 s
- `POST /api/chat` contract rechecked from route + tests: 401 / 400 / 429 paths covered; success path still enforces non-empty `cited_fact_ids` subset validation before returning 200
- `POST /api/proactive/whatsapp-opt-in` rechecked from route logic: unauthenticated → 401, invalid `phone_e164` → 400, unconfigured provider returns explicit 200 stub response instead of silent success/failure ambiguity
- `GET /api/dashboard/compare-months` rechecked from tests: 401 / 400 / 200 / 404 / 500 paths covered
- `.env.local.example` re-audited against current source usage. Run 2 confirmed the example file covers newly used P4 keys as well as runtime/test keys surfaced in code paths: `NEXT_PUBLIC_PAYWALL_PROMPT_ENABLED`, `WHATSAPP_API_URL`, `WHATSAPP_API_TOKEN`, `NEXT_PUBLIC_SENTRY_DSN`, `CI`, `NODE_ENV`, `NEXT_RUNTIME`, `VITEST`
- Chat input guard remains strict at `1-500` chars
- WhatsApp opt-in validates E.164 format before any provider call
- Chat still emits `chat_query_submitted` before the `GEMINI_API_KEY` availability check. This is not blocking QA, but it means telemetry can overcount attempted chat usage when chat is unavailable
- Weekly recap master/worker telemetry path remains non-blocking on PostHog failure because all awaited captures are individually wrapped with `.catch(...)`
- WhatsApp provider failure returns explicit 502 when the downstream responds non-OK or the fetch throws; no silent pass-through
- No new blocking performance regressions identified in Run 2
- Residual accepted risk unchanged: chat timeout returns 504 correctly, but the underlying Gemini request is still not hard-aborted at the HTTP layer
- No new blocking UX issues found in Run 2
- Non-blocking analytics caveat retained: chat availability telemetry can lead actual availability unless metric-plan excludes unavailable-session attempts
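
The WhatsApp opt-in behaviour rechecked in Run 2 (401, 400, 200 stub, 502) can be summarised as a small decision function. This is a sketch, not the route itself: `optInStatusSketch` and its input shape are hypothetical, and `providerOk` stands for the outcome of the provider call, false when the downstream responds non-OK or the fetch throws.

```typescript
// E.164: "+" followed by at most 15 digits, with a non-zero first digit.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

interface OptInInput {
  authenticated: boolean;
  phoneE164: string;
  providerConfigured: boolean;
  providerOk: boolean;
}

// Hypothetical decision order: auth, then input validation, then provider state.
function optInStatusSketch(input: OptInInput): number {
  if (!input.authenticated) return 401;
  if (!E164_PATTERN.test(input.phoneE164)) return 400;
  // No provider configured: answer with an explicit stub rather than calling out.
  if (!input.providerConfigured) return 200;
  return input.providerOk ? 200 : 502;
}
```

Validating the number before checking the provider keeps malformed input from ever reaching the downstream API, which is the ordering the Run 2 notes describe.
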

PASS

- Second QA pass completed under Codex / GPT-5 (`gpt-5`)
- Automated verification remains green at 109/109
- No new blocking defects found
- One non-blocking follow-up remains relevant for metric planning: treat `chat_query_submitted` as attempted demand, not guaranteed chat availability

Recommended next command: `/metric-plan`