Commit f9b8463

feat: complete open ecosystem upgrade package with marketplace ops, observability, tests, and docs
1 parent 482fa57 commit f9b8463

20 files changed

Lines changed: 6674 additions & 113 deletions

Dockerfile.frontend

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,8 @@ WORKDIR /app
 COPY frontend/package*.json ./
 RUN npm install
 COPY frontend/ ./
+ARG VITE_DEFAULT_VIEW=hud
+ENV VITE_DEFAULT_VIEW=${VITE_DEFAULT_VIEW}
 RUN npm run build
 
 # Stage 2: Serve

Makefile

Lines changed: 2 additions & 2 deletions
@@ -164,9 +164,9 @@ check: fmt vet lint-soft test
 alerts-test:
 	@echo "🚨 Running Prometheus alert rule tests..."
 	@docker run --rm --entrypoint /bin/promtool -v "$$(pwd):/workspace" -w /workspace prom/prometheus:v2.48.0 \
-		check rules fl_slo_alerts.yml fl_detailed_alerts.yml tpm_alerts.yml
+		check rules fl_slo_alerts.yml fl_detailed_alerts.yml tpm_alerts.yml marketplace_alerts.yml
 	@docker run --rm --entrypoint /bin/promtool -v "$$(pwd):/workspace" -w /workspace prom/prometheus:v2.48.0 \
-		test rules fl_slo_alerts.test.yml fl_detailed_alerts.test.yml tpm_alerts.test.yml
+		test rules fl_slo_alerts.test.yml fl_detailed_alerts.test.yml tpm_alerts.test.yml marketplace_alerts.test.yml
 	@$(GO) test ./internal/monitoring -run "TestAlertmanagerRoutingPolicy|TestAlertmanagerInhibitionPolicy"
 	@echo "✅ Alert rule tests passed"

README.md

Lines changed: 30 additions & 0 deletions
@@ -21,6 +21,8 @@ Production-grade federated learning platform that combines Byzantine-resilient a
 [![Dashboards Upgrade](https://img.shields.io/badge/Grafana-STARRED%20Live%20Dashboards-f59e0b?style=flat-square&logo=grafana&logoColor=white)](grafana/provisioning/dashboards)
 [![PySyft Demo](https://img.shields.io/badge/PySyft-Mohawk%20PoC%20Ready-10b981?style=flat-square)](examples/pysyft-integration)
 [![Prometheus Ready](https://img.shields.io/badge/Prometheus-Scrape%20Ready-ef4444?style=flat-square&logo=prometheus&logoColor=white)](prometheus.yml)
+[![Open Ecosystem](https://img.shields.io/badge/Open%20Ecosystem-Sprint%203%20Local--First-0ea5e9?style=flat-square)](docs/OPEN_ECOSYSTEM_FIRST_10_MINUTES.md)
+[![Marketplace Alerts](https://img.shields.io/badge/Alerts-Marketplace%20Guardrails-f97316?style=flat-square)](marketplace_alerts.yml)
 
 ## Mobile Shield Update March 2026
 
@@ -80,6 +82,34 @@ Operator validation commands:
 - `make observability-smoke`
 - `python3 scripts/check_dashboard_queries.py`
 
+## Open Ecosystem Upgrade March 2026
+
+This upgrade package adds a local-first marketplace and governance workflow with production-facing observability guardrails.
+
+What is included:
+
+- Marketplace flows: offers, intents, matching, escrow release, dispute workflows, and governance proposals/voting.
+- Network expansion flows: attestation sharing, self-service invite requests, admin approval/rejection/revocation.
+- Dashboard and metrics integration: marketplace/governance snapshots in `/metrics_summary` and expanded HUD browser demo controls.
+- Prometheus additions: `marketplace_alerts.yml` with stall/high-watermark detection plus promtool tests in `marketplace_alerts.test.yml`.
+- API contract tests: local positive-path and negative-path coverage under `tests/scripts/python/test_marketplace_local_contracts.py` and `tests/scripts/python/test_marketplace_negative_paths.py`.
+
+Primary references:
+
+- First 10 minutes guide: [docs/OPEN_ECOSYSTEM_FIRST_10_MINUTES.md](docs/OPEN_ECOSYSTEM_FIRST_10_MINUTES.md)
+- Sprint 1 roadmap: [docs/OPEN_ECOSYSTEM_SPRINT1_ROADMAP.md](docs/OPEN_ECOSYSTEM_SPRINT1_ROADMAP.md)
+- Sprint 2 roadmap: [docs/OPEN_ECOSYSTEM_SPRINT2_ROADMAP.md](docs/OPEN_ECOSYSTEM_SPRINT2_ROADMAP.md)
+- API examples: [docs/api/http-examples.md](docs/api/http-examples.md)
+- Backend implementation: [sovereignmap_production_backend_v2.py](sovereignmap_production_backend_v2.py)
+- Grafana operations dashboard: [grafana/provisioning/dashboards/operations_overview.json](grafana/provisioning/dashboards/operations_overview.json)
+
+Validation commands:
+
+- `make observability-smoke`
+- `make alerts-test`
+- `python3 tests/scripts/python/test_marketplace_local_contracts.py`
+- `python3 tests/scripts/python/test_marketplace_negative_paths.py`
+
 ## Performance Tuning Knobs
 
 The following environment variables are available for safe runtime tuning:

dashboard_compat_rules.yml

Lines changed: 3 additions & 3 deletions
@@ -94,10 +94,10 @@ groups:
       expr: tpm_ca_certificate_valid
 
     # per-node trust score: verified/total ratio when certs exist;
-    # "> -Inf" filters out NaN (0/0 case) so the "or" fallback to CA validity
-    # triggers correctly when no certificates have been issued yet.
+    # clamp/min guards keep this finite and in the expected 0-100 range,
+    # while fallback uses CA validity when no certificates are present.
     - record: tpm_node_trust_score
-      expr: (tpm_certificates_verified_total / tpm_certificates_total > -Inf) or tpm_ca_certificate_valid
+      expr: (clamp_max(100 * (tpm_certificates_verified_total / clamp_min(tpm_certificates_total, 1)), 100) and (tpm_certificates_total > 0)) or (100 * tpm_ca_certificate_valid)
 
     # message signing operations — proxy via total certificate issuances
     - record: tpm_messages_signed_total
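
To make the behavior of the new `tpm_node_trust_score` expression easier to check, the sketch below emulates it for a single series in Python. It assumes all three metrics carry matching label sets; the function name `trust_score` is illustrative and not part of the repository. One thing the emulation makes visible: the new expression yields a 0-100 percentage, whereas the removed one yielded a 0-1 ratio (or the raw CA validity value), so thresholds written against the old scale may need review.

```python
from typing import Optional


def trust_score(verified: float, total: float, ca_valid: Optional[float]) -> Optional[float]:
    """Emulate the recording rule for one label set (illustrative only)."""
    if total > 0:
        # Left side: clamp_max(100 * (verified / clamp_min(total, 1)), 100),
        # kept only where tpm_certificates_total > 0 (the "and" filter).
        return min(100.0 * verified / max(total, 1.0), 100.0)
    if ca_valid is not None:
        # Right side of "or": fallback to 100 * tpm_ca_certificate_valid.
        return 100.0 * ca_valid
    # No sample on either side: the rule produces no series.
    return None


if __name__ == "__main__":
    print(trust_score(3, 4, 1))   # certificates exist: ratio as a percentage
    print(trust_score(0, 0, 1))   # no certificates yet: CA validity fallback
```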

docker-compose.full.yml

Lines changed: 20 additions & 0 deletions
@@ -4,6 +4,8 @@ services:
     build:
       context: .
       dockerfile: Dockerfile.frontend
+      args:
+        VITE_DEFAULT_VIEW: ${VITE_DEFAULT_VIEW:-hud}
     image: frontend:latest
     container_name: sovereign-frontend
     ports:
@@ -15,6 +17,24 @@ services:
     networks:
       - sovereign-network
     restart: always
+
+  frontend-admin:
+    build:
+      context: .
+      dockerfile: Dockerfile.frontend
+      args:
+        VITE_DEFAULT_VIEW: browser_demo
+    image: frontend-admin:latest
+    container_name: sovereign-frontend-admin
+    ports:
+      - "${FRONTEND_ADMIN_HOST_PORT:-3003}:80"
+    environment:
+      - NODE_ENV=production
+    depends_on:
+      - backend
+    networks:
+      - sovereign-network
+    restart: always
   # ========================================================================
   # SOVEREIGN MAPS BACKEND (Flower Aggregator + Flask Metrics)
   # ========================================================================

docs/ALERT_RUNBOOKS.md

Lines changed: 17 additions & 5 deletions
@@ -7,8 +7,8 @@ This document defines first-response procedures for SLO and consensus alerts.
 ### Routing and Inhibition Baseline
 
 - Route policy source: [alertmanager.yml](../alertmanager.yml)
-- Rule sources: [fl_slo_alerts.yml](../fl_slo_alerts.yml), [fl_detailed_alerts.yml](../fl_detailed_alerts.yml), [tpm_alerts.yml](../tpm_alerts.yml)
-- Unit test sources: [fl_slo_alerts.test.yml](../fl_slo_alerts.test.yml), [fl_detailed_alerts.test.yml](../fl_detailed_alerts.test.yml), [tpm_alerts.test.yml](../tpm_alerts.test.yml), [internal/monitoring/alertmanager_config_test.go](../internal/monitoring/alertmanager_config_test.go)
+- Rule sources: [fl_slo_alerts.yml](../fl_slo_alerts.yml), [fl_detailed_alerts.yml](../fl_detailed_alerts.yml), [tpm_alerts.yml](../tpm_alerts.yml), [marketplace_alerts.yml](../marketplace_alerts.yml)
+- Unit test sources: [fl_slo_alerts.test.yml](../fl_slo_alerts.test.yml), [fl_detailed_alerts.test.yml](../fl_detailed_alerts.test.yml), [tpm_alerts.test.yml](../tpm_alerts.test.yml), [marketplace_alerts.test.yml](../marketplace_alerts.test.yml), [internal/monitoring/alertmanager_config_test.go](../internal/monitoring/alertmanager_config_test.go)
 
 Inhibition semantics:
 
@@ -66,9 +66,9 @@ Inhibition semantics:
 
 ### Coverage Summary
 
-- Total alerts configured: 34
-- Alerts with explicit runbook section in this document: 16
-- Alerts with promtool rule unit tests: 34
+- Total alerts configured: 36
+- Alerts with explicit runbook section in this document: 18
+- Alerts with promtool rule unit tests: 36
 - Alertmanager routing and inhibition policy tests: covered by internal/monitoring/alertmanager_config_test.go
 
 ## FLRoundStalled
@@ -166,3 +166,15 @@ Inhibition semantics:
 - Verify replay rate trend from `mohawk_tpm_nonce_replay_rejections_total` and determine whether it is expected (duplicate retries) or anomalous (nonce generation collision/replay attack).
 - Correlate with client retry storms and transport retransmissions; high replay without failure spikes usually indicates duplicate delivery.
 - If anomalous, rotate nonce derivation context for the affected round and audit ingress paths for duplicate submissions.
+
+## MarketplaceEscrowStalled
+
+- Confirm `sovereign_marketplace_escrow_locked` is non-zero and verify `increase(sovereign_marketplace_payout_total[30m]) == 0` in Prometheus UI.
+- Inspect pending contracts via `/marketplace/contracts?payout_status=pending` and verify no active disputes are blocking payout release.
+- Triage release path by checking `/marketplace/escrow/release` API logs and recent governance actions for moderation holds.
+
+## MarketplaceEscrowHighWatermark
+
+- Validate current locked amount against expected round budget and contract volume.
+- Inspect for stale contracts that remain pending after round completion and release in controlled batches.
+- If sustained, tighten intent budget limits or increase release cadence to keep escrow within policy bounds.
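
The first `MarketplaceEscrowStalled` triage bullet combines two conditions: escrow is locked, and the payout counter has not increased over 30 minutes. The Python sketch below shows that combination under stated assumptions. The actual rule in `marketplace_alerts.yml` is not shown in this excerpt, the function name `escrow_stalled` is hypothetical, and the increase calculation is a simplification of PromQL's `increase()` (it handles counter resets but does not extrapolate to the window edges).

```python
from typing import Sequence


def escrow_stalled(escrow_locked: float, payout_samples: Sequence[float]) -> bool:
    """Return True when escrow is locked but payouts did not advance.

    payout_samples: counter values observed over the window, oldest first.
    """
    if escrow_locked <= 0 or len(payout_samples) < 2:
        return False
    increase = 0.0
    for prev, cur in zip(payout_samples, payout_samples[1:]):
        # A drop in a counter is treated as a reset, so the new value counts in full.
        increase += (cur - prev) if cur >= prev else cur
    return increase == 0


if __name__ == "__main__":
    print(escrow_stalled(50.0, [10.0, 10.0, 10.0]))
    print(escrow_stalled(50.0, [10.0, 12.0]))
```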
Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
+# Open Ecosystem First 10 Minutes (Local)
+
+This guide runs entirely local.
+
+## Prerequisites
+
+1. Backend API available at `http://localhost:8000`.
+2. Frontend available at `http://localhost:3000` (or local Vite port).
+
+## 1. Create Offer
+
+```bash
+curl -s -X POST http://localhost:8000/marketplace/offers \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "seller_node_id": "node-quickstart-1",
+    "dataset_fingerprint": "sha256:quickstart-local-001",
+    "title": "Quickstart Image Pack",
+    "modality": "image",
+    "quality_score": 0.84,
+    "allowed_tasks": ["classification"],
+    "price_per_round": 10.0,
+    "min_rounds": 1,
+    "attestation_status": "verified"
+  }' | jq
+```
+
+## 2. Create Intent
+
+```bash
+curl -s -X POST http://localhost:8000/marketplace/round_intents \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "model_owner_id": "owner-quickstart",
+    "task_type": "classification",
+    "required_modalities": ["image"],
+    "min_quality_score": 0.7,
+    "budget_total": 100
+  }' | jq
+```
+
+Capture `round_intent_id` from the response.
+
+## 3. Match Contract
+
+```bash
+curl -s -X POST http://localhost:8000/marketplace/match \
+  -H 'Content-Type: application/json' \
+  -d '{"round_intent_id": "intent-REPLACE_ME", "max_offers": 3}' | jq
+```
+
+Capture `contract_id` from the response.
+
+## 4. Trigger One Training Round
+
+```bash
+curl -s -X POST http://localhost:8000/trigger_fl | jq
+```
+
+## 5. Release Escrow
+
+```bash
+curl -s -X POST http://localhost:8000/marketplace/escrow/release \
+  -H 'Content-Type: application/json' \
+  -d '{"contract_id": "contract-REPLACE_ME"}' | jq
+```
+
+## 6. Inspect Contract Timeline and Metrics
+
+```bash
+curl -s http://localhost:8000/marketplace/contracts | jq '.contracts[0].timeline'
+curl -s http://localhost:8000/training/status | jq '.marketplace_pending_contract'
+curl -s http://localhost:8000/metrics_summary | jq '.marketplace'
+```
+
+## Troubleshooting
+
+1. `no_compatible_offers_found`:
+
+- Check `details.rejection_reasons` in response.
+- Increase `budget_total` or reduce quality threshold.
+
+1. `round_intent_not_open`:
+
+- Intent was already matched/cancelled/closed.
+- Create a new intent or patch status appropriately.
+
+1. `contract_already_released`:
+
+- Escrow for that contract is already released.
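
The six curl steps in the guide above can also be chained in one script, which removes the manual `REPLACE_ME` substitution. The following is a minimal Python sketch using only the standard library. The endpoints and request bodies are copied from the guide; the helper names (`http_json`, `find_key`, `run_quickstart`) are hypothetical, and because the guide does not document the response shapes, `find_key` searches the whole JSON response for `round_intent_id` and `contract_id` rather than assuming where they sit.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # backend address from the guide's prerequisites


def http_json(method, path, payload=None):
    """Send a JSON request to the local backend and decode the JSON response."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = request.Request(BASE_URL + path, data=data, method=method,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode())


def find_key(obj, key):
    """Depth-first search for `key` in nested JSON (response layout is assumed unknown)."""
    if isinstance(obj, dict):
        if key in obj:
            return obj[key]
        children = list(obj.values())
    elif isinstance(obj, list):
        children = obj
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def run_quickstart(call=http_json):
    """Run steps 1-6 of the guide; `call` is injectable so the flow can be tested offline."""
    call("POST", "/marketplace/offers", {
        "seller_node_id": "node-quickstart-1",
        "dataset_fingerprint": "sha256:quickstart-local-001",
        "title": "Quickstart Image Pack",
        "modality": "image",
        "quality_score": 0.84,
        "allowed_tasks": ["classification"],
        "price_per_round": 10.0,
        "min_rounds": 1,
        "attestation_status": "verified",
    })
    intent = call("POST", "/marketplace/round_intents", {
        "model_owner_id": "owner-quickstart",
        "task_type": "classification",
        "required_modalities": ["image"],
        "min_quality_score": 0.7,
        "budget_total": 100,
    })
    intent_id = find_key(intent, "round_intent_id")
    match = call("POST", "/marketplace/match",
                 {"round_intent_id": intent_id, "max_offers": 3})
    contract_id = find_key(match, "contract_id")
    call("POST", "/trigger_fl")
    call("POST", "/marketplace/escrow/release", {"contract_id": contract_id})
    return call("GET", "/metrics_summary").get("marketplace")


if __name__ == "__main__":
    print(json.dumps(run_quickstart(), indent=2))
```

Error responses such as `no_compatible_offers_found` are not handled here; `urllib` raises `HTTPError` on non-2xx status codes, so a failed step stops the script at that step.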
Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+# Open Ecosystem Sprint 1 Roadmap
+
+## Sprint Goal
+
+Deliver a user-friendly local-first marketplace loop that a new user can complete in one session:
+
+1. Create offer
+2. Create intent
+3. Match contract
+4. Trigger training round
+5. Release escrow
+6. Inspect timeline and metrics
+
+## Duration
+
+- 2 weeks
+
+## Scope
+
+### P0 (Must Deliver)
+
+1. Deterministic marketplace API error codes and messages.
+2. Match failure diagnostics to explain why offers did not match.
+3. Contract lifecycle timeline (created -> bound_to_round -> escrow_released).
+4. Dashboard visibility for pending contracts and marketplace summary.
+5. Positive and negative-path local tests.
+6. First-10-minutes onboarding guide.
+
+### P1 (If Capacity Allows)
+
+1. UI score breakdown view (quality, cost, trust).
+2. Intent status workflow guardrails in controls.
+3. Lightweight policy preview before matching.
+
+## User Stories
+
+1. As a data provider, I can create an offer with clear field guidance and immediate validation.
+2. As a model owner, I can understand exactly why matching failed.
+3. As an operator, I can see contract state transitions in chronological order.
+4. As a reviewer, I can verify ecosystem activity through metrics and operation events.
+5. As a new integrator, I can finish the full local flow in 10 minutes.
+
+## Acceptance Criteria
+
+1. Marketplace endpoints return stable error codes and messages on all validation failures.
+2. Match failure includes machine-readable rejection reasons and counts.
+3. Every matched contract includes timeline events.
+4. `training/status` includes the pending contract summary.
+5. `metrics_summary` includes marketplace snapshot.
+6. Local smoke and negative tests pass.
+7. Frontend build passes.
+
+## Risks and Mitigations
+
+1. Risk: ambiguous match outcomes.
+
+- Mitigation: include rejection reason counters and budget rejection count.
+
+1. Risk: accidental status misuse.
+
+- Mitigation: enforce intent status transitions server-side.
+
+1. Risk: duplicate escrow release.
+
+- Mitigation: explicit `contract_already_released` error.
+
+## Definition of Done
+
+1. Backend, frontend, tests, and docs updated.
+2. No diagnostics errors in touched files.
+3. Local backend marketplace smoke test passes.
+4. Local backend negative-path test passes.
+5. Frontend production build passes.
+
+## Demo Checklist
+
+1. Create an offer from the UI.
+2. Create an intent from the UI.
+3. Run a match and inspect status.
+4. Trigger one FL round and inspect contract binding.
+5. Release escrow and confirm updated timeline.
+6. Inspect marketplace section in `/metrics_summary`.
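
The roadmap's contract timeline (created -> bound_to_round -> escrow_released) and its duplicate-release mitigation can be illustrated as a small state machine. This is a sketch under stated assumptions, not the backend's implementation: the `Contract` and `MarketplaceError` classes and the `invalid_transition` code are hypothetical, and it assumes events must occur strictly in the listed order. Only the event names and the `contract_already_released` code come from the roadmap.

```python
TIMELINE = ("created", "bound_to_round", "escrow_released")


class MarketplaceError(Exception):
    """Error carrying a stable, machine-readable code (hypothetical class)."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


class Contract:
    def __init__(self, contract_id):
        self.contract_id = contract_id
        self.timeline = ["created"]  # events in chronological order

    def advance(self, event):
        """Append the next lifecycle event, rejecting repeats and out-of-order events."""
        current = self.timeline[-1]
        if current == "escrow_released":
            raise MarketplaceError("contract_already_released",
                                   "escrow for this contract is already released")
        expected = TIMELINE[TIMELINE.index(current) + 1]
        if event != expected:
            # "invalid_transition" is an illustrative code, not one named in the roadmap.
            raise MarketplaceError("invalid_transition",
                                   f"expected {expected!r}, got {event!r}")
        self.timeline.append(event)


if __name__ == "__main__":
    contract = Contract("contract-demo")
    contract.advance("bound_to_round")
    contract.advance("escrow_released")
    try:
        contract.advance("escrow_released")
    except MarketplaceError as err:
        print(err.code)
```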
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+# Open Ecosystem Sprint 2 Roadmap
+
+## Sprint Goal
+
+Deliver trust and governance transparency for marketplace operations:
+
+1. Explainable scoring in matching outcomes.
+2. Local dispute workflow for contract issues.
+3. Governance action logging surfaces.
+4. Dashboard visibility for dispute and governance activity.
+
+## Duration
+
+- 2 weeks
+
+## Scope
+
+### P0 (Must Deliver)
+
+1. Match score breakdown included per selected offer.
+2. Governance activity endpoints (create/list).
+3. Dispute endpoints (create/list/update).
+4. Metrics summary governance snapshot.
+5. UI rendering of score breakdown and governance activity.
+6. Automated test coverage for new endpoints.
+
+### P1 (If Capacity Allows)
+
+1. Dispute SLA timers and escalation status.
+2. Governance action filtering by actor and source.
+3. Policy proposal voting workflow stub.
+
+## User Stories
+
+1. As a buyer, I can see why an offer was selected using score components.
+2. As an operator, I can submit disputes for problematic contracts.
+3. As a moderator, I can update dispute status and leave resolution notes.
+4. As a governance observer, I can see recent governance actions in one view.
+
+## Acceptance Criteria
+
+1. `/marketplace/match` includes `score_breakdown` and `selection_diagnostics`.
+2. `/marketplace/disputes` supports create/list.
+3. `/marketplace/disputes/<id>` supports status updates.
+4. `/governance/actions` supports create/list.
+5. `/metrics_summary` includes governance snapshot.
+6. Frontend displays score breakdown and recent governance actions.
+7. Local tests and frontend build pass.
+
+## Definition of Done
+
+1. Backend API endpoints implemented and documented.
+2. Frontend views expose explainability and governance visibility.
+3. Tests validate score, dispute, and governance workflows.
+4. No diagnostics errors in touched files.
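
To illustrate what an explainable `score_breakdown` per selected offer might contain, the Python sketch below computes one from the three components the Sprint 1 roadmap names (quality, cost, trust). Everything beyond those component names is an assumption: the weights, the cost formula, the input parameters, and the output layout are hypothetical and may differ from what `/marketplace/match` actually returns.

```python
# Hypothetical weights; the roadmaps name the components but not how they are weighted.
WEIGHTS = {"quality": 0.5, "cost": 0.3, "trust": 0.2}


def score_breakdown(quality_score, price_per_round, budget_per_round, trust_score):
    """Return per-component scores, their weighted values, and the total (illustrative)."""
    if budget_per_round <= 0:
        raise ValueError("budget_per_round must be positive")
    components = {
        "quality": quality_score,
        # Cheaper offers relative to the budget score higher; floored at 0.
        "cost": max(0.0, 1.0 - price_per_round / budget_per_round),
        "trust": trust_score,
    }
    weighted = {name: WEIGHTS[name] * value for name, value in components.items()}
    return {
        "components": components,
        "weighted": weighted,
        "total": sum(weighted.values()),
    }


if __name__ == "__main__":
    # Values loosely based on the quickstart offer (quality 0.84, price 10, budget 100).
    print(score_breakdown(0.84, 10.0, 100.0, 1.0))
```

Returning both raw and weighted components lets a UI show why one offer outranked another without re-deriving the weights on the client.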
