Skip to content

Latest commit

 

History

History
116 lines (75 loc) · 10.9 KB

File metadata and controls

116 lines (75 loc) · 10.9 KB

Exploration — Issue 010: MoneyMirror Phase 3

Date: 2026-04-05
Agent: Research Agent
Stage: explore
Issue: issue-010.md — Phase 3 umbrella (unified dashboard, transaction-native insights, expert AI coaching)


Problem Analysis

Assumption (to validate in product): Users who upload multiple bank and/or credit card statements over time need a single mental model for money: global date range, explicit source scope, and line-level evidence before category rollups and coaching feel trustworthy.

Verified fact (from issue-009 exploration + shipped Phases 1–2): The core MoneyMirror wedge — perception gap, India-specific parsing, consequence-first tone — is already validated as a hair-on-fire problem for the ICP. Phase 3 does not re-litigate that; it addresses credibility at scale when data volume and heterogeneity increase.

Current-state pain (product, not market):

Symptom Why it hurts
Single-statement / month-picker bias Users with 2+ uploads cannot reason “across everything I’ve given you.”
Bucket-only insights (“Wants is high”) Feels like a generic budgeting app; no merchant proof (e.g. Zomato ₹X / N txns).
Generative coaching without server facts JSON Model-invented numbers destroy trust; regulatory exposure if copy reads like personalized investment advice.
Split mental models for bank vs card Multi-account Gen Z reality is one wallet; UI must expose a clear scope model and how perceived vs actual behaves under rollups.

Conclusion: The problem is real for retained / multi-upload users and for north-star behaviors (second upload, return sessions) that depend on “this matches my PDF / my life.”


Market Scan

Phase 3 is primarily depth vs Indian fintech incumbents, not a new category entry. Findings align with exploration-009.md:

Layer Competitors / alternatives Strength Gap relevant to Phase 3
Aggregation Jupiter / Fi, INDMoney, CRED Scale, brand, rewards Coach-first + user-owned PDF truth still rare; none optimize for txn-level explainability + behavioral coaching with India-specific tone.
Manual Excel, Notion, screenshots Full control No parsing, no nudges, high friction.
AI chat ChatGPT, etc. Fluency No durable ledger; hallucination risk on amounts — exactly what facts-grounded coaching avoids.

Unserved gap (Phase 3 angle): A transaction-native surface with merchant rollups and deterministic numbers in AI — i.e. “show your work” finance coaching for India, mobile-first.


User Pain Level

Classification: Moderate → Critical (segment-dependent).

Segment Level Reasoning
First-time / single-statement user Moderate Phase 2 may be “enough” until they upload again or add a card.
Multi-statement / bank + card user Critical for trust Mismatch between app and intuition (“why doesn’t this match?”) becomes support-grade; without txn truth, North Star (repeat upload) stalls.
PM / owner (delivery) Moderate Scope overlap (VIJ-25 vs Phase 3) creates process pain, not user pain — solvable by merge/supersede (already logged in project state).

Net: For the Phase 3 ICP (multiple uploads, multi-source), pain intensity is high — not because budgeting is novel, but because trust collapses without evidence and scope clarity.


Opportunity Assessment

Dimension Assessment
Value High — unlocks differentiated “guide” positioning vs bucket dashboards; reduces generic-coach feel.
Market size Same TAM as issue-009; Phase 3 increases retention and depth within the wedge, not TAM expansion.
Willingness to adopt High if performance stays snappy and copy stays educational, not advice (see risks).
Distribution No new channel; benefits existing users and referrals who hit multi-statement reality.

Hypothesis (from issue file, refined):

If we ship transaction surface + unified scope + merchant rollups + facts-grounded AI for users with multiple statements, they will trust category and coaching views more, return to upload a second statement sooner, and reduce “doesn’t match my PDF” confusion because the UI always points to line-level evidence.


Proposed MVP Experiment

Smallest slice that validates the riskiest assumptions: T1 — Transaction surface + API (P0 foundation).

Element Definition
Core Server-backed transactions list (pagination, filters aligned with future scope), one “proof” path from insight → txn rows (even if merchant rollup is thin in v1).
Intentionally excluded Full T5–T6 (IA/growth, compare months) per PM plan; defer heavy Gemini merchant normalization until T3 patterns are clear — start with heuristics + deterministic rollups where possible.
Learning goals (1) Parser variance exposure across issuers in the wild; (2) query/latency envelope for cross-statement reads; (3) whether users engage with txn truth before caring about richer AI.

Success signals (exploration-level, not final metrics):

  • Technical: p95 API latency acceptable on realistic account sizes; no unbounded scans.
  • Product: qualitative — users can answer “where did this number come from?” via txn drill-down.

Risks

Risk Type Mitigation (directional)
PDF / parser variance Technical Issuer-specific tests; graceful degradation; never claim precision you don’t have in UI copy.
Merchant normalization (Zomato vs ZOMATO vs UPI) Technical / cost v1 heuristics + dedupe rules; optional Gemini post-pass only where ROI clear; structured outputs; cost/latency budget in plan.
Regulatory / product safety Market / legal Extend docs/COACHING-TONE.md with expert examples; educational, not advice; facts JSON only as numeric source for generative layer.
Performance at multi-statement scale Technical Indexed queries, pagination, .limit() on lists; avoid N+1; align with architecture-guide DB-first and serverless limits.
Perceived vs actual under global rollup Product Explicit PRD decision before UI lock: blended vs per-account — wrong default erodes the Mirror brand.
Scope duplication (VIJ-25) Process Superseded by VIJ-37 / T1–T4 mapping (per project-state); enforce single backlog.

Final Recommendation

Build.

Phase 3 is the natural depth layer for an already shipped product with a validated ICP. It directly attacks trust and repeat engagement — prerequisites for the North Star proxy (second-month upload) at scale. Risks are known and plannable; none are “kill the initiative” without mitigation paths.

Explore further only as spikes inside /create-plan, not as a delay: merchant normalization strategy, rollup semantics for perceived spend, and performance budgets should be decision records in plan-010.md, not open-ended research.

Discard — not applicable; the umbrella is coherent and aligned with repo + Linear (VIJ-37 / VIJ-38–VIJ-41).


Next step

Run /create-plan to produce experiments/plans/plan-010.md with PRD, UX, architecture, schema, and acceptance criteria per T1–T6 (with T5–T6 explicitly deferred/backlogged as in issue-010).