This plan defines the next concrete execution slice after PR #99 and focuses on three outcomes:
- Production-safe auth and policy for interaction decision logging.
- Searchable, replayable digital-twin decision context for operators.
- Stability gates for merge confidence (tests, lint, and strict soak checks).
The current interaction workflow is operational, but merge confidence and operator trust require:
- Explicit auth boundaries for `/ai/interaction/decision`.
- Backend contract tests for interaction APIs.
- Search/replay quality improvements in the HUD twin perspective.
Files expected to change:
- `sovereignmap_production_backend_v2.py`
- `README.md`
- `docs/api/openapi.yaml`
Concrete implementation:
- Add an explicit policy mode for interaction decisions:
  - `AI_INTERACTION_DECISION_AUTH_MODE=public_local|admin_required`
- Default to `admin_required` for production-safe behavior.
- Allow `public_local` only for local/test flows with explicit env opt-in.
- Return structured auth failure payload with remediation hint.
Acceptance criteria:
- Decision POST is blocked unless policy mode allows request context.
- Auth mode is documented with examples.
- Existing local dev flow still works when enabled explicitly.
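The policy described above can be sketched as a small, framework-independent helper. This is a minimal sketch, not the backend's actual code: the function names, the `is_admin` and `is_local_request` inputs, and the failure payload field names are all assumptions for illustration. Only the environment variable name and its two values come from the plan.

```python
import os

VALID_MODES = ("public_local", "admin_required")
DEFAULT_MODE = "admin_required"


def resolve_auth_mode(environ=None):
    """Read AI_INTERACTION_DECISION_AUTH_MODE, falling back to the strict default."""
    environ = os.environ if environ is None else environ
    raw = environ.get("AI_INTERACTION_DECISION_AUTH_MODE", "").strip().lower()
    # Unset or unrecognized values fail closed to admin_required.
    return raw if raw in VALID_MODES else DEFAULT_MODE


def authorize_decision(mode, is_admin, is_local_request):
    """Return (allowed, payload); payload is None when the request is allowed."""
    # Assumption: public_local only opens the endpoint to local requests.
    if mode == "public_local" and is_local_request:
        return True, None
    # Assumption: admin callers are allowed in either mode.
    if is_admin:
        return True, None
    # Hypothetical failure envelope; the field names are illustrative only.
    return False, {
        "error": "decision_auth_required",
        "auth_mode": mode,
        "remediation": (
            "Send admin credentials, or set "
            "AI_INTERACTION_DECISION_AUTH_MODE=public_local for local/test runs."
        ),
    }
```

Failing closed on an unset or misspelled value keeps a configuration typo from silently opening the endpoint, which is the main reason to resolve the mode in one place.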
Files expected to change:
- `tests/scripts/python/` (new focused test module)
- Optional test helpers under existing test utilities
Concrete implementation:
- Add tests for:
  - `GET /ai/interaction/summary`
  - `GET /ai/interaction/history`
  - `POST /ai/interaction/decision`
- Cover both positive and negative paths:
  - valid decision values (`approve`, `edit`, `reject`, `undo`)
  - invalid decision rejection
  - auth-restricted behavior under strict mode
- Assert response envelope fields are present and typed.
Acceptance criteria:
- Tests fail if schema-critical keys are removed.
- Tests fail if auth mode behavior regresses.
- Tests run in CI without requiring external services.
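One possible shape for such a contract test module is sketched below. It runs against a stand-in handler so that it needs no external services. The handler `fake_post_decision`, the status codes, and the envelope keys in `REQUIRED_FIELDS` are hypothetical placeholders; the real tests would call the backend's own handlers and assert the schema documented in `docs/api/openapi.yaml`.

```python
import unittest

VALID_DECISIONS = {"approve", "edit", "reject", "undo"}

# Hypothetical envelope keys and types; replace with the real response schema.
REQUIRED_FIELDS = {"status": str, "decision": str, "review_id": str}


def fake_post_decision(body, auth_mode="public_local", is_admin=False):
    """Stand-in for POST /ai/interaction/decision; returns (status_code, json)."""
    if auth_mode == "admin_required" and not is_admin:
        return 403, {"error": "decision_auth_required"}
    if body.get("decision") not in VALID_DECISIONS:
        return 400, {"error": "invalid_decision"}
    return 200, {"status": "ok", "decision": body["decision"], "review_id": "r-1"}


def missing_or_mistyped(payload, required=REQUIRED_FIELDS):
    """List required keys that are absent or carry the wrong type."""
    return [k for k, t in required.items() if not isinstance(payload.get(k), t)]


class DecisionContractTest(unittest.TestCase):
    def test_valid_decisions_return_typed_envelope(self):
        for decision in sorted(VALID_DECISIONS):
            code, payload = fake_post_decision({"decision": decision})
            self.assertEqual(code, 200)
            self.assertEqual(missing_or_mistyped(payload), [])

    def test_invalid_decision_is_rejected(self):
        code, _ = fake_post_decision({"decision": "maybe"})
        self.assertEqual(code, 400)

    def test_strict_mode_blocks_non_admin(self):
        code, _ = fake_post_decision(
            {"decision": "approve"}, auth_mode="admin_required"
        )
        self.assertEqual(code, 403)
```

Checking keys and types through one helper means that removing a schema-critical key fails the test with the name of the missing field rather than a generic `KeyError`.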
Files expected to change:
- `frontend/src/HUD.jsx`
- Optional helper/style files if needed
Concrete implementation:
- Add a decision search input (filter by action, decision, route, reason).
- Add replay list controls:
  - sort by time (newest/oldest)
  - quick filters (`approve`, `reject`, `undo`)
- Add a replay detail panel for selected decision:
  - model route
  - operator reason
  - associated prompt
  - timestamp and review ID
Acceptance criteria:
- Operators can find a prior decision in under 10 seconds.
- Search and filters work with both backend and locally-added entries.
- Replay detail remains readable on mobile and desktop.
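The search, quick-filter, and sort behavior can be pinned down independently of the UI. The sketch below expresses that logic in Python for consistency with the backend examples; the HUD itself would implement the equivalent in JavaScript. The `DecisionEntry` field names are assumptions that mirror the searchable attributes listed above, not the actual entry shape.

```python
from dataclasses import dataclass


@dataclass
class DecisionEntry:
    # Field names are hypothetical; they mirror the searchable attributes above.
    review_id: str
    action: str
    decision: str
    route: str
    reason: str
    timestamp: float


def filter_decisions(entries, query="", quick_filter=None, newest_first=True):
    """Apply free-text search, an optional quick filter, and a time sort."""
    needle = query.strip().lower()
    matched = []
    for entry in entries:
        # Quick filter: exact match on the decision value.
        if quick_filter and entry.decision != quick_filter:
            continue
        # Free-text search: case-insensitive substring over the four fields.
        haystack = " ".join(
            (entry.action, entry.decision, entry.route, entry.reason)
        ).lower()
        if needle and needle not in haystack:
            continue
        matched.append(entry)
    return sorted(matched, key=lambda e: e.timestamp, reverse=newest_first)
```

Because the function only reads plain fields, backend-loaded and locally added entries go through the same path as long as both are normalized to one shape first.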
Files expected to change:
- `AI_INTERACTION_PHASE4_CONCRETE_EXECUTION_PLAN.md` (status updates)
- Optional docs updates if commands change
Concrete implementation:
- Run and record targeted checks:
  - `python3 -m py_compile sovereignmap_production_backend_v2.py`
  - `npm --prefix frontend run lint`
  - `npm --prefix frontend run test -- --run src/HUD.test.jsx src/C2SwarmHUD.test.jsx`
  - `npm --prefix frontend run build`
- Run strict chaos guard and attach pass/fail details:
  - `SOAK_CHAOS_ENABLED=1 SOAK_CHAOS_STRICT=1 CHAOS_MIN_CLIENT_QUORUM=1 python3 tests/scripts/python/test_soak_chaos_guard.py`
Acceptance criteria:
- Required checks are documented in PR body.
- Any remaining warnings are explicitly called out with owner/follow-up.
- Strict soak result is reported, not omitted.
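To make sure no check is silently skipped, the commands above could be driven by a small runner that always executes the full matrix and emits a pass/fail list for the PR body. This is a sketch under assumptions: `run_checks` and `to_markdown` are hypothetical helper names, and the script is assumed to run from the repository root. The commands and environment variables are copied from the checks listed above.

```python
import os
import subprocess

# (label, argv, extra environment) taken from the checks listed above.
CHECKS = [
    ("py_compile",
     ["python3", "-m", "py_compile", "sovereignmap_production_backend_v2.py"], {}),
    ("lint", ["npm", "--prefix", "frontend", "run", "lint"], {}),
    ("frontend tests",
     ["npm", "--prefix", "frontend", "run", "test", "--", "--run",
      "src/HUD.test.jsx", "src/C2SwarmHUD.test.jsx"], {}),
    ("build", ["npm", "--prefix", "frontend", "run", "build"], {}),
    ("strict soak",
     ["python3", "tests/scripts/python/test_soak_chaos_guard.py"],
     {"SOAK_CHAOS_ENABLED": "1", "SOAK_CHAOS_STRICT": "1",
      "CHAOS_MIN_CLIENT_QUORUM": "1"}),
]


def run_checks(checks=CHECKS, runner=subprocess.run):
    """Run every check, never stopping early, and return (label, passed) pairs."""
    results = []
    for label, argv, extra_env in checks:
        env = {**os.environ, **extra_env}
        try:
            passed = runner(argv, env=env).returncode == 0
        except OSError:
            # A missing executable counts as a failure rather than aborting the run.
            passed = False
        results.append((label, passed))
    return results


def to_markdown(results):
    """Render results as a bullet list suitable for pasting into a PR body."""
    return "\n".join(
        f"- {label}: {'PASS' if ok else 'FAIL'}" for label, ok in results
    )
```

Calling `print(to_markdown(run_checks()))` would produce one line per check, so a failing strict soak run still appears in the evidence instead of being omitted.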
- D1 auth hardening + docs.
- D2 backend contract tests.
- D3 search/replay UI.
- D4 full validation and PR evidence updates.
- `AI-AUTH-001`: Add interaction decision auth mode and enforcement.
- `AI-AUTH-002`: Document auth mode and testing examples.
- `AI-TEST-001`: Add summary/history/decision backend contract tests.
- `AI-UX-001`: Implement decision search and quick filters in HUD.
- `AI-UX-002`: Implement replay detail panel.
- `AI-OPS-001`: Run validation matrix and publish evidence.
- Risk: Auth hardening may break local test UX.
  - Mitigation: explicit `public_local` mode and startup logs.
- Risk: Replay UX increases HUD complexity.
  - Mitigation: default collapsed panel and simple filter presets.
- Risk: Strict soak remains unstable.
  - Mitigation: capture first failing assertion and gate merge on documented disposition.
- Auth mode implemented and documented.
- Interaction API contract tests added and passing.
- HUD replay search/filter/detail shipped.
- Lint/test/build checks completed.
- Strict soak run completed with outcome included.
- PR description includes validation commands and residual risk notes.