Commit a25b612

Add shared AI interaction summary and operator UX (#98)
* chore: verify CI lint workflow and apply black/lint updates

* Overhaul autonomy docs and harden CPU chaos validation

  - Add comprehensive autonomy core package (contracts, map/twin state, planner, predictor, readiness) with tests and integration points.
  - Extend backend with autonomy insight endpoints and align OpenAPI coverage.
  - Expand HUD and C2 surfaces with autonomy KPI, safety, recommendation, and timeline overlays.
  - Normalize drone telemetry ingest confidence/health metadata and map query contract support.
  - Switch node-agent to CPU-only PyTorch wheels to reduce container footprint in CI/dev runs.
  - Harden chaos suite fallback by passing admin auth headers for trigger_fl.
  - Fix monitoring package integration script/test wiring and refresh repository docs with a full autonomy operations guide plus updated README/C2/monitoring/changelog.

* Add concrete AI interaction implementation plan

  - Turn the AI usability recommendations into a phased, ticketable plan.
  - Cover command bar, structured recommendations, approval flow, model routing, metadata, mission context, and search.
  - Define rollout criteria, validation strategy, and first-branch execution order for the new feature branch.

* Add shared AI interaction summary and operator UX

  - add the /ai/interaction/summary backend payload with quick actions and recommendations
  - wire the HUD and C2 surfaces to the shared interaction summary
  - fix the app-shell fetch path, add focused C2 coverage, and refresh docs
  - update the AI interaction plan to record completed work and follow-up items

* Merge origin/main into feature/ai-interaction-concrete-plan resolving all conflicts

  Agent-Logs-Url: https://github.com/rwilliamspbg-ops/Sovereign_Map_Federated_Learning/sessions/46437f5b-1ee4-4823-9841-3307a0d01ca0
  Co-authored-by: rwilliamspbg-ops <221235059+rwilliamspbg-ops@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
1 parent 8eea96a commit a25b612

8 files changed

Lines changed: 877 additions & 9 deletions

AI_INTERACTION_CONCRETE_PLAN.md

Lines changed: 280 additions & 0 deletions
@@ -0,0 +1,280 @@
# AI Interaction and Ease-of-Use Plan

## Objective

Make the system easier to use, easier to trust, and easier to drive with AI-assisted interactions by reducing manual setup, exposing structured recommendations, and keeping advanced controls available but not dominant.

## Progress Update

Implemented so far:

1. Shared AI interaction summary endpoint for HUD and C2.
2. Natural-language command bar and structured recommendation cards in the main HUD.
3. Quick-action routing and summary previews in C2.

Remaining follow-up work:

1. Add explicit approve/edit/reject/undo loops for AI-suggested actions.
2. Expand model-routing heuristics beyond the current summary/planner split.
3. Persist user interaction history so operators can review prior AI decisions.

## Product Principles

1. Minimize the number of user decisions required for the common path.
2. Show a clear reason, confidence, and consequence for every AI recommendation.
3. Keep human approval in the loop for any action that changes mission state.
4. Hide prompt engineering behind product actions and model routing.
5. Default to the simplest path, but let experts expand into full control.

## Current Surface To Build On

1. Primary HUD at `frontend/src/HUD.jsx`.
2. C2 view at `frontend/src/C2SwarmHUD.jsx`.
3. Control-plane backend in `sovereignmap_production_backend_v2.py`.
4. Autonomy core in `internal/autonomy/*`.
5. Live ops events from SSE and metrics endpoints already in place.

## Recommendation Map

### 1) Make the main interaction path one-step

What to build:

1. A single natural-language command bar for the primary HUD.
2. Pinned quick actions for the most common workflows.
3. Context auto-fill from current mission, map, policy, and twin state.

Implementation details:

1. Parse user intent into a structured action request.
2. Show the interpreted action before execution.
3. Require one confirmation click for state-changing actions.
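
The three implementation steps above can be sketched in Python, the backend's language. This is a minimal illustration, not the project's parser: the keyword table, the action names, the fixed `0.9` confidence, and the `ActionRequest` type are all hypothetical, and a real command bar would likely use a model or grammar instead of keyword matching.

```python
from dataclasses import dataclass, field

# Hypothetical keyword table: keyword -> (action name, changes mission state?).
INTENT_KEYWORDS = {
    "reroute": ("reroute_swarm", True),
    "status": ("show_status", False),
    "summarize": ("summarize_mission", False),
}


@dataclass
class ActionRequest:
    action: str
    confidence: float            # placeholder value, not a calibrated score
    requires_confirmation: bool
    missing_inputs: list = field(default_factory=list)


def parse_command(text: str) -> ActionRequest:
    """Map a one-sentence command to a structured request via keyword matching."""
    words = text.lower().split()
    for keyword, (action, state_changing) in INTENT_KEYWORDS.items():
        if keyword in words:
            return ActionRequest(action, 0.9, state_changing)
    # Unknown intent: report zero confidence and ask the user to clarify.
    return ActionRequest("clarify", 0.0, False, ["intent"])


def execute(request: ActionRequest, confirmed: bool = False) -> str:
    """Refuse state-changing actions until the operator has confirmed them."""
    if request.requires_confirmation and not confirmed:
        return "pending_confirmation"
    return f"executed:{request.action}"
```

The UI would render the returned `ActionRequest` as the "interpreted action" and call `execute(..., confirmed=True)` only after the confirmation click.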

Acceptance criteria:

1. Users can request a common task in one sentence.
2. The UI shows the interpreted action, confidence, and required inputs.
3. No manual navigation is needed for the default workflow.

### 2) Expose structured AI suggestions, not free-form text

What to build:

1. Recommendation cards with action, reason, confidence, and expected outcome.
2. Safe alternatives when the top action is blocked by policy.
3. Clear rejection reasons when the AI declines to act.

Implementation details:

1. Use a typed payload for suggestions from backend to frontend.
2. Normalize outputs into a common shape: `action`, `reason`, `confidence`, `risk`, `expected_gain`, `blocked_reason`.
3. Render the same structure in HUD and C2.
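
One way to express the common shape as a typed payload is a frozen dataclass. The six field names come from the list above; the value ranges, the risk labels, and the `Recommendation` and `to_payload` names are assumptions made for this sketch.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class Recommendation:
    action: str
    reason: str
    confidence: float                      # assumed range: 0.0 to 1.0
    risk: str                              # assumed labels: "low", "medium", "high"
    expected_gain: str
    blocked_reason: Optional[str] = None   # set only when policy blocks the action

    def __post_init__(self):
        # Reject malformed suggestions before they reach the frontend.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
        if self.risk not in {"low", "medium", "high"}:
            raise ValueError(f"unknown risk label: {self.risk}")

    @property
    def blocked(self) -> bool:
        return self.blocked_reason is not None


def to_payload(rec: Recommendation) -> dict:
    """Serialize to the plain dict that both HUD and C2 would consume."""
    return asdict(rec)
```

Because `blocked_reason` is part of the shape, a blocked card always carries its explanation, which is what the third acceptance criterion asks for.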

Acceptance criteria:

1. Every recommendation is explainable in one glance.
2. The same recommendation format is reused across views.
3. Blocked actions always show why they were blocked.

### 3) Add approve/edit/reject loops for every AI action

What to build:

1. Buttons for approve, edit, reject, and undo.
2. A review drawer that lets users modify the AI-suggested action before execution.
3. A visible audit trail of user decisions.

Implementation details:

1. Keep state changes behind a confirmation boundary.
2. Save the user override reason with the action event.
3. Include rollback hooks for reversible actions.
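
A rough sketch of the confirmation boundary, the saved override reason, and the rollback hooks follows. `ReviewQueue`, `AuditEvent`, and the in-memory log are hypothetical stand-ins; the plan does not say how the backend stores audit events.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class AuditEvent:
    action: str
    decision: str    # "approve", "edit", "reject", or "undo"
    operator: str
    reason: str
    at: str          # UTC timestamp in ISO 8601 form


@dataclass
class ReviewQueue:
    audit_log: list = field(default_factory=list)
    _rollbacks: dict = field(default_factory=dict)

    def _record(self, action, decision, operator, reason):
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append(AuditEvent(action, decision, operator, reason, stamp))

    def decide(self, action, decision, operator, reason,
               apply: Callable[[], None],
               rollback: Optional[Callable[[], None]] = None) -> bool:
        """Log every decision; run `apply` only for approve or edit."""
        self._record(action, decision, operator, reason)
        if decision not in ("approve", "edit"):
            return False
        apply()
        if rollback is not None:
            self._rollbacks[action] = rollback   # remember how to reverse it
        return True

    def undo(self, action, operator, reason) -> bool:
        """Run the stored rollback hook if the action was reversible."""
        rollback = self._rollbacks.pop(action, None)
        if rollback is None:
            return False
        rollback()
        self._record(action, "undo", operator, reason)
        return True
```

Nothing mutates state except through `decide`, so each change is preceded by a log entry naming who decided and why.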

Acceptance criteria:

1. The user can accept, change, or reject any AI suggestion.
2. Every decision is logged with who approved it and why.
3. Undo is available for reversible operations.

### 4) Use progressive disclosure for expert controls

What to build:

1. A simple default mode.
2. An expert mode that reveals model selection, thresholds, and raw telemetry.
3. A “Do it for me” mode for routine tasks and a manual override mode for advanced users.

Implementation details:

1. Keep advanced controls collapsed until explicitly requested.
2. Preserve a single route to the same underlying action, regardless of mode.
3. Make mode switches persistent per user preference.
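
The "single route regardless of mode" idea can be illustrated with a small sketch in which the mode selects only which controls are visible, while the safety check runs on one shared path. The control names and the no-fly-zone rule are invented for illustration.

```python
# Hypothetical control identifiers; expert mode only adds to the default set.
DEFAULT_CONTROLS = ["command_bar", "recommendations"]
EXPERT_CONTROLS = DEFAULT_CONTROLS + ["model_selection", "thresholds", "raw_telemetry"]


def visible_controls(mode: str) -> list:
    """Return the controls to render; anything but 'expert' gets the default set."""
    return EXPERT_CONTROLS if mode == "expert" else DEFAULT_CONTROLS


def safety_check(action: dict) -> bool:
    """Illustrative rule only: block anything that enters a no-fly zone."""
    return not action.get("enters_no_fly_zone", False)


def run_action(action: dict, mode: str) -> dict:
    """Single execution route: the mode changes the view, never the check."""
    status = "executed" if safety_check(action) else "blocked"
    return {"status": status, "controls": visible_controls(mode)}
```

Since `safety_check` never receives the mode, switching modes cannot weaken it.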

Acceptance criteria:

1. New users see a smaller, less intimidating surface.
2. Power users can still access raw telemetry and model controls.
3. Mode changes never change the underlying safety checks.

### 5) Route requests to the right model automatically

What to build:

1. A lightweight model-router layer that selects a model by task type.
2. Default routing for classification, summarization, planning, and map reasoning.
3. Manual override for power users and debugging.

Implementation details:

1. Define task classes such as `summary`, `planner`, `map_reasoning`, `safety_review`, and `search`.
2. Route based on latency budget, cost, and output shape.
3. Keep routing decisions visible in debug mode.
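
A minimal router sketch under stated assumptions: the task classes are the ones named above, but the model names, their latency figures, and the fallback rule are placeholders. Cost and output shape are left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    model: str
    reason: str    # surfaced in debug mode so the choice stays explainable


# Hypothetical models with illustrative typical latencies in milliseconds.
MODEL_LATENCY_MS = {"small-fast": 200, "large-reasoning": 2000}

# Assumed default model per task class.
PREFERRED = {
    "summary": "small-fast",
    "search": "small-fast",
    "planner": "large-reasoning",
    "map_reasoning": "large-reasoning",
    "safety_review": "large-reasoning",
}
FALLBACK = "small-fast"


def route(task: str, latency_budget_ms: int, override: Optional[str] = None) -> Route:
    """Pick a model by task class, falling back when the budget is too tight."""
    if override is not None:
        return Route(override, "manual override")
    if task not in PREFERRED:
        raise ValueError(f"unknown task class: {task}")
    model = PREFERRED[task]
    if MODEL_LATENCY_MS[model] > latency_budget_ms:
        return Route(FALLBACK, f"{model} exceeds {latency_budget_ms} ms budget")
    return Route(model, f"default for task class '{task}'")
```

For example, `route("planner", 500)` falls back to the fast model and records why, which is the information a debug panel would display.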

Acceptance criteria:

1. The user does not need to choose a model for common tasks.
2. The system picks an appropriate model automatically.
3. Routing decisions are explainable when expanded.

### 6) Return structured outputs and metadata every time

What to build:

1. Standard response envelopes for AI answers.
2. Confidence, assumptions, sources, and freshness fields.
3. Support for multimodal output where appropriate.

Implementation details:

1. Enforce a shared schema across backend responses.
2. Include `confidence`, `assumptions`, `source_span`, `freshness_secs`, and `next_action`.
3. Make the UI render charts, tables, and map overlays from the same response family.
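
The envelope could look roughly like the sketch below. The five metadata fields are the ones listed above; the `answer` field, the `make_envelope` helper, and the way freshness is derived from a data timestamp are assumptions, not the backend's actual schema.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

REQUIRED_KEYS = {"confidence", "assumptions", "source_span", "freshness_secs", "next_action"}


@dataclass
class Envelope:
    answer: str                          # hypothetical field for the main content
    confidence: float
    assumptions: list = field(default_factory=list)
    source_span: Optional[str] = None
    freshness_secs: float = 0.0
    next_action: Optional[str] = None


def make_envelope(answer, confidence, data_timestamp, now=None, **extra) -> Envelope:
    """Wrap an answer and compute how stale its underlying data is."""
    now = time.time() if now is None else now
    age = max(0.0, now - data_timestamp)
    return Envelope(answer=answer, confidence=confidence, freshness_secs=age, **extra)


def missing_keys(payload: dict) -> list:
    """Return the required metadata keys that a payload lacks, sorted by name."""
    return sorted(REQUIRED_KEYS - set(payload))
```

A schema test can then assert that `missing_keys(dataclasses.asdict(envelope))` is empty for every backend response.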

Acceptance criteria:

1. AI responses are consistent and machine-readable.
2. Users can see trust signals without opening developer tools.
3. The UI can render the same answer as text, card, or overlay.

### 7) Improve mission awareness and search

What to build:

1. Live context panels that auto-load map, telemetry, policy, and recent events.
2. Fast search over past decisions and operator actions.
3. A timeline that explains why the AI changed its recommendation.

Implementation details:

1. Load context automatically on page open.
2. Index decisions, replan triggers, overrides, and twin changes.
3. Let the user ask questions like “why did the system reroute here?”
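
As a sketch of the indexing step, a small in-memory inverted index over decision records is shown below. It supports keyword lookup only; answering a free-form "why" question would need a layer on top. The record fields mirror the cause, decision, and outcome chain from this section, and all class names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DecisionRecord:
    timestamp: float
    kind: str        # e.g. "decision", "replan_trigger", "override", "twin_change"
    cause: str
    decision: str
    outcome: str


class DecisionIndex:
    """Token-to-record inverted index kept entirely in memory."""

    def __init__(self):
        self._records = []
        self._by_token = {}

    def add(self, record: DecisionRecord) -> None:
        position = len(self._records)
        self._records.append(record)
        text = f"{record.kind} {record.cause} {record.decision} {record.outcome}"
        for token in set(text.lower().split()):
            self._by_token.setdefault(token, set()).add(position)

    def search(self, query: str) -> list:
        """Return records containing every query token, newest first."""
        tokens = query.lower().split()
        if not tokens:
            return []
        hits = set.intersection(*(self._by_token.get(t, set()) for t in tokens))
        matches = [self._records[i] for i in hits]
        return sorted(matches, key=lambda r: r.timestamp, reverse=True)
```

Searching for `"reroute"` returns every reroute decision with its recorded cause and outcome, which is the raw material a timeline view would link together.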

Acceptance criteria:

1. Users can inspect prior decisions in seconds.
2. The system can answer “why” questions using recent state and audit history.
3. The timeline links cause, decision, and outcome.

## Implementation Phases

### Phase 1: Command Bar and Recommendation Cards

Goal: reduce the first action path to one sentence and one confirmation.

Deliverables:

1. Command bar in `frontend/src/HUD.jsx`.
2. Recommendation cards for action/reason/confidence/risk.
3. Backend payload shape for structured AI suggestions.

### Phase 2: Explainability and Safety Review

Goal: make every AI action inspectable before execution.

Deliverables:

1. Review drawer with approve/edit/reject/undo.
2. Safety panel with policy state and blocked-reason display.
3. Audit trail events for all AI-assisted decisions.

### Phase 3: Model Routing and Structured Metadata

Goal: remove manual model choice from the common path.

Deliverables:

1. Task-type router service.
2. Standard AI response envelope.
3. Debug mode that shows model choice and fallback logic.

### Phase 4: Mission Context and Search

Goal: make the system answerable and self-explanatory.

Deliverables:

1. Context auto-loading panels.
2. Decision search and replay.
3. Timeline linking inputs to outputs and outcomes.

### Phase 5: Rollout and Hardening

Goal: deploy safely with measurable usability gains.

Deliverables:

1. Feature flags for command bar, recommendation cards, and expert mode.
2. Canary rollout plan.
3. UX telemetry dashboard.

## Suggested Ticket Breakdown

1. UX-001: Build the natural-language command bar.
2. UX-002: Add structured recommendation cards.
3. UX-003: Add approve/edit/reject/undo workflow.
4. UX-004: Add expert mode and progressive disclosure.
5. ML-001: Add automatic model routing.
6. ML-002: Add structured response envelopes.
7. OPS-001: Add context auto-loading and search.
8. OPS-002: Add decision timeline and replay.

## Metrics To Track

1. time_to_first_action_secs
2. command_success_rate
3. recommendation_acceptance_rate
4. edit_after_suggestion_rate
5. override_rate
6. model_routing_accuracy
7. answer_confidence_display_rate
8. decision_replay_usage
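
The plan does not define these metrics, so the sketch below shows one possible reading of three of them, computed from a list of operator decision events. The event shape and the formulas (in particular, counting both edits and rejections as overrides) are assumptions.

```python
from collections import Counter


def interaction_rates(events: list) -> dict:
    """Compute decision-based rates from events shaped like {"decision": "approve"}."""
    counts = Counter(event["decision"] for event in events)
    total = sum(counts.values())
    if total == 0:
        # No suggestions reviewed yet: report zeros instead of dividing by zero.
        return {
            "recommendation_acceptance_rate": 0.0,
            "edit_after_suggestion_rate": 0.0,
            "override_rate": 0.0,
        }
    return {
        # Share of suggestions approved unchanged.
        "recommendation_acceptance_rate": counts["approve"] / total,
        # Share of suggestions the operator modified before running.
        "edit_after_suggestion_rate": counts["edit"] / total,
        # Assumed definition: any suggestion not taken as-is.
        "override_rate": (counts["edit"] + counts["reject"]) / total,
    }
```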

## Rollout Criteria

1. New users can complete the common path without a tutorial.
2. Recommendation cards reduce back-and-forth interaction.
3. Operator overrides stay within acceptable bounds.
4. No safety regressions are introduced by model routing or structured outputs.

## Validation Plan

1. Unit tests for command parsing, recommendation ranking, and response schema validation.
2. Integration tests for backend payloads and HUD rendering.
3. E2E tests for command bar -> suggestion -> confirm -> execute -> audit trail.
4. Usability review with the simplest path measured in clicks and time-to-action.

## First Branch Execution Order

1. Create the command bar and recommendation card components.
2. Add structured AI response envelopes in the backend.
3. Wire approve/edit/reject/undo actions to the audit trail.
4. Add model routing and expert-mode toggles.
5. Add search and timeline replay after the core flow is stable.

## Definition of Done

1. The common path is shorter and clearer for new users.
2. AI suggestions are structured, explainable, and actionable.
3. Advanced controls remain available without overwhelming the default view.
4. The system can justify model choice, action choice, and safety decisions.

README.md

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ Traditional federated learning can look simple in a lab and fail badly in the fi
 - One command boots the full stack and its observability surfaces.
 - A small local node set exercises aggregation, policy checks, and proof verification.
 - The runtime exposes the same health path operators use in production-style demos.
+- The primary HUD and C2 view share a structured AI interaction summary so common actions stay explainable and easy to trigger.
 
 ## Try It Now

docs/api/openapi.yaml

Lines changed: 14 additions & 0 deletions
@@ -579,6 +579,20 @@ paths:
                 type: object
                 additionalProperties: true
 
+  /ai/interaction/summary:
+    get:
+      tags: [Operations]
+      summary: Get structured AI interaction summary for HUD and C2
+      operationId: getAiInteractionSummary
+      responses:
+        '200':
+          description: AI interaction summary payload
+          content:
+            application/json:
+              schema:
+                type: object
+                additionalProperties: true
+
   /model_registry:
     get:
       tags: [Operations]

frontend/src/App.jsx

Lines changed: 12 additions & 6 deletions
@@ -69,6 +69,7 @@ function App() {
   const [hudData, setHudData] = useState(null);
   const [health, setHealth] = useState(null);
   const [metricsSummary, setMetricsSummary] = useState(null);
+  const [interactionSummary, setInteractionSummary] = useState(null);
   const [trustStatus, setTrustStatus] = useState(null);
   const [policyHistory, setPolicyHistory] = useState([]);
   const [founders, setFounders] = useState([]);
@@ -263,10 +264,11 @@ function App() {
   const fetchData = async () => {
     setLoading(true);
     try {
-      const [hudFetch, healthFetch, metricsFetch, foundersFetch, trainingFetch, opsHealthFetch, opsTrendsFetch] = await Promise.allSettled([
+      const [hudFetch, healthFetch, metricsFetch, interactionFetch, foundersFetch, trainingFetch, opsHealthFetch, opsTrendsFetch] = await Promise.allSettled([
         fetch(`${API_BASE}/hud_data`),
         fetch(`${API_BASE}/health`),
         fetch(`${API_BASE}/metrics_summary`),
+        fetch(`${API_BASE}/ai/interaction/summary`),
         fetch(`${API_BASE}/founders`),
         fetch(`${API_BASE}/training/status`),
         fetch(`${API_BASE}/ops/health`),
@@ -285,10 +287,11 @@ function App() {
         }
       };
 
-      const [hud, healthData, metrics, foundersData, trainingData, opsHealthData, opsTrendsData] = await Promise.all([
+      const [hud, healthData, metrics, interactionData, foundersData, trainingData, opsHealthData, opsTrendsData] = await Promise.all([
         safeJson(toResponse(hudFetch), hudData || {}),
         safeJson(toResponse(healthFetch), health || {}),
         safeJson(toResponse(metricsFetch), metricsSummary || {}),
+        safeJson(toResponse(interactionFetch), interactionSummary || {}),
         safeJson(toResponse(foundersFetch), founders || []),
         safeJson(toResponse(trainingFetch), { status: 'idle', active: false, round: 0 }),
         safeJson(toResponse(opsHealthFetch), opsHealth || null),
@@ -359,6 +362,7 @@ function App() {
       setHudData(mergedHudData);
       setHealth(healthData);
       setMetricsSummary(metrics);
+      setInteractionSummary(interactionData);
       setFounders(foundersData);
       setTrainingStatus(nextTraining);
       setOpsHealth(nextOpsHealth);
@@ -457,16 +461,17 @@ function App() {
     }
   };
 
-  const submitVoiceQuery = async () => {
-    if (!voiceQuery.trim()) {
+  const submitVoiceQuery = async (overrideQuery) => {
+    const query = String(overrideQuery ?? voiceQuery ?? '').trim();
+    if (!query) {
       return;
     }
 
     try {
       const response = await fetch(`${API_BASE}/chat`, {
         method: 'POST',
         headers: { 'Content-Type': 'application/json' },
-        body: JSON.stringify({ query: voiceQuery })
+        body: JSON.stringify({ query })
       });
       if (response.ok) {
         const result = await response.json();
@@ -573,7 +578,7 @@ function App() {
   }
 
   if (showC2SwarmHUD) {
-    return <C2SwarmHUD apiBase={API_BASE} />;
+    return <C2SwarmHUD apiBase={API_BASE} interactionSummary={interactionSummary} />;
   }
 
   return (
@@ -597,6 +602,7 @@ function App() {
         hudData={hudData}
         health={health}
         metricsSummary={metricsSummary}
+        interactionSummary={interactionSummary}
         trustStatus={trustStatus}
         policyHistory={policyHistory}
         founders={founders}
