feat(#256): autonomous Tor-vs-clearnet benchmark harness + finalized methodology #293

Open
VijitSingh97 wants to merge 12 commits into develop from feat/256-benchmark-harness

VijitSingh97 commented Jun 17, 2026

Collaborator

Supersedes the #268 draft. The gate for the final #165/#166 Tor-default decision.

What

  • docs/benchmarks/tor-vs-clearnet.md: finalized methodology. The full fleet (miner-0..7, ~269 kH/s) mines on mini, where we are ~1.8% of the sidechain and find ~155 shares/day: enough statistical power while staying representative (nano would make us ~11% of the sidechain and overstate the penalty; main is too sparse). The design is an interleaved A/B (T/C/T/C blocks, with the post-switch settling window discarded) to control time-correlated variance, run to a target share count, with the clearnet arm's own variance as the noise floor. Also adds the autonomous-on-gouda architecture and runbook.
  • bench-orchestrate.sh — autonomous driver (interleave + egress-gate + collector + per-cycle health + status.json heartbeat + self-heal). Runs detached on gouda; cron watchdog + @reboot restart it; resumes from state on crash/reboot. So the run survives the operator being offline for days.
  • bench-healthcheck.sh — read-only checklist → OK/WARN/BROKEN.
  • bench-status.sh — one-screen summary for ssh gouda bench-status over Tailscale.
  • bench-collect.sh — per-interval /stats snapshotter (brought over from the #268 draft).
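
The sizing argument in the first bullet reduces to simple arithmetic. A minimal sketch, assuming the mini sidechain targets one share every 10 seconds (an assumption not stated in this PR; substitute the real target for another sidechain). expected_shares_per_day is a hypothetical helper for illustration, not part of the harness:

```shell
#!/usr/bin/env bash
# Hypothetical helper: expected shares/day for a miner holding a given
# fraction of a sidechain's hashrate.
#   $1 = our fraction of the sidechain hashrate, in percent (e.g. 1.8)
#   $2 = sidechain share interval in seconds (ASSUMED to be 10 for mini)
expected_shares_per_day() {
  awk -v pct="$1" -v interval="$2" \
    'BEGIN { printf "%.1f\n", (86400 / interval) * (pct / 100) }'
}

expected_shares_per_day 1.8 10   # prints: 155.5
```

Under that assumption the result lines up with the ~155 shares/day quoted above.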

The live run

Live on gouda since 2026-06-17: --pool mini --block-hours 20 --blocks 6 --settle-hours 3 (~5 days, 3 blocks/arm, ≈done 2026-06-22), XvB disabled so the full ~269 kH/s reaches p2pool in both arms — the auto optimizer otherwise routes most of the fleet into XvB (chasing the Whale tier) and confounds reward_share, the primary metric. Only the p2pool transport (and the #270 firewall) flips between arms; monerod + Tari stay on Tor throughout.
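
The run parameters above imply a fixed schedule that can be checked by hand. A minimal sketch, assuming strict alternation starting with tor (the PR describes T/C/T/C) and assuming the settle window is discarded at the start of every block, including the first. bench_schedule is a hypothetical helper, not the orchestrator's own code:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the interleaved block plan for a run.
#   $1 = block length in hours, $2 = number of blocks, $3 = settle hours
bench_schedule() {
  local block_hours=$1 blocks=$2 settle_hours=$3 i arm start
  for ((i = 0; i < blocks; i++)); do
    # Strict alternation, tor first: T/C/T/C...
    if ((i % 2 == 0)); then arm=tor; else arm=clearnet; fi
    start=$((i * block_hours))
    # The first settle_hours of each block are excluded from measurement.
    printf 'block %d: %-8s h%03d-h%03d (measure from h%03d)\n' \
      "$((i + 1))" "$arm" "$start" "$((start + block_hours))" \
      "$((start + settle_hours))"
  done
  # Per-arm figure assumes an even number of blocks.
  printf 'total: %dh; measured per arm: %dh\n' \
    "$((blocks * block_hours))" \
    "$((blocks / 2 * (block_hours - settle_hours)))"
}

bench_schedule 20 6 3   # last line: total: 120h; measured per arm: 51h
```

With these values the total is 120 hours (5 days) and each arm keeps 3 blocks of 17 measured hours.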

Two benchmark-blocking bugs surfaced + fixed (both validated live on gouda)

  1. set_arm now toggles the #270 egress firewall (Tor-only egress enforced at the network layer, fail-closed) together with the arm: ON for tor (fail-closed, egress-gated), OFF for clearnet (otherwise the firewall DROPs p2pool's clearnet dials → 0 sidechain peers → silent garbage).
  2. Real pithead bug (#294): network.tor_egress_firewall=false never disabled the firewall. jq's // true coerces an explicit false back to true (the // alternative operator treats false, not just null, as empty), so the documented #270 opt-out has been dead since it shipped. The same latent bug affected .xvb.tor. Both reads are now extracted into a config_bool() helper and regression-tested (tests/stack: 386 passed). Validated: the clearnet arm went from 0 to 54 peers, and the rigorous egress check after switching back to tor was all-clean.
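
The failure mode in item 2 can be reproduced with jq alone. A minimal sketch, assuming jq is installed. The config_bool signature shown here (file, jq path, default) is a guess for illustration and may differ from the helper in this PR; read_bool_buggy is a hypothetical name for the old inline pattern:

```shell
#!/usr/bin/env bash
# BUGGY pattern: jq's `//` yields the right-hand side when the left side is
# null OR false, so an explicit false is silently replaced by the default.
read_bool_buggy() {   # $1 = config file, $2 = jq path
  jq -r "$2 // true" "$1"
}

# Hypothetical config_bool: fall back to the default only when the key is
# absent/null, so a configured false is honoured.
config_bool() {       # $1 = config file, $2 = jq path, $3 = default (true|false)
  jq -r --argjson dflt "$3" "$2 | if . == null then \$dflt else . end" "$1"
}

cfg=$(mktemp)
printf '%s\n' '{"network":{"tor_egress_firewall":false}}' > "$cfg"

read_bool_buggy "$cfg" '.network.tor_egress_firewall'    # prints: true  (explicit false lost)
config_bool "$cfg" '.network.tor_egress_firewall' true   # prints: false (explicit false honoured)
config_bool "$cfg" '.xvb.tor' true                       # prints: true  (absent, default applies)
rm -f "$cfg"
```

Keeping the default on for an absent key preserves the fail-closed behaviour, while an explicit false now takes effect.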

Validated end-to-end on gouda

A fast 2-block dry run exercised the whole machine (arm switch via pithead apply, egress-gate=PASS/skip, per-arm collector, bench-status, RUN COMPLETE, clean self-restore). shellcheck clean.

Merge after the run completes (~2026-06-22): the per-arm results + the final #165/#166 Tor-default recommendation land in docs/privacy.md in this PR before merge, so closing #256 delivers the actual decision (not just the harness).

Closes #256
Closes #294

🤖 Generated with Claude Code

VijitSingh97 and others added 12 commits June 17, 2026 11:09
…rd the collector

Finalize the methodology against real numbers: full fleet miner-0..7 (~269 kH/s) on
the `mini` sidechain (~1.8% of it → ~155 of our shares/day — representative + enough
power; `nano` would make us ~11% and overstate the penalty, `main` is too sparse).
Switch the design to an INTERLEAVED A/B (T/C/T/C in ~1.5-2d blocks, discard ~6h post
switch) to control time-correlated variance, run to a target share count, and use the
clearnet arm's own variance as the noise floor.

Add the "autonomous on gouda" section + runbook: the run lives entirely on the box
(detached orchestrator + cron watchdog + @reboot) so it survives the operator being
offline for days; observe over Tailscale via `bench-status`. Scripts (orchestrate /
healthcheck / status) land next, validated on gouda before the real run.

Brings bench-collect.sh onto develop (was only on the #256 draft); supersedes #268.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…+ status)

Three gouda-resident scripts so the Tor-vs-clearnet run survives the operator being
offline for days:

- bench-orchestrate.sh — interleaves the arms on a fixed block schedule (T/C/T/C),
  egress-gates each switch (rigorous multi-poll), keeps the collector running,
  health-checks each cycle, writes a status.json heartbeat, and self-heals (re-up
  stack + re-gate on BROKEN). All state in ~/pithead-bench, so a crash/reboot
  resumes from state. Subcommands: run | --calibrate | --watchdog | --status |
  --stop | --install-cron (the watchdog cron + @reboot restart it if it dies).
- bench-healthcheck.sh — read-only checklist -> OK/WARN/BROKEN (stack, p2pool,
  hashrate, Tor-arm egress + firewall); reused by the loop + the cron.
- bench-status.sh — one-screen summary for `ssh gouda bench-status` over Tailscale.

shellcheck clean. To be validated on gouda (dry-run) before the real fleet run.
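
One way to reduce a read-only checklist to a single OK/WARN/BROKEN verdict is to keep the worst severity seen. A minimal sketch; the function names and the example check results are hypothetical and not taken from bench-healthcheck.sh:

```shell
#!/usr/bin/env bash
# Hypothetical roll-up: the overall verdict is the worst individual result.
severity() {
  case "$1" in
    OK) echo 0 ;;
    WARN) echo 1 ;;
    *) echo 2 ;;   # BROKEN, and anything unrecognised, ranks highest
  esac
}

overall_status() {   # args: one result per check
  local worst=OK result
  for result in "$@"; do
    if (( $(severity "$result") > $(severity "$worst") )); then
      worst=$result
    fi
  done
  # Report unrecognised results as BROKEN rather than echoing them back.
  case "$worst" in
    OK|WARN|BROKEN) echo "$worst" ;;
    *) echo BROKEN ;;
  esac
}

overall_status OK WARN OK        # prints: WARN
overall_status OK BROKEN WARN    # prints: BROKEN
```

Treating an unknown result as BROKEN keeps the roll-up conservative, which suits a check that gates self-healing.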

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ecs via awk)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-WARN during the ramp

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…status missing-jsonl guard

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…s) to avoid false BROKEN on brief Tari dials

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… noisy share-based hashrate estimate

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…t read as failing (old launch events linger in the tail)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ng the fleet to XvB (target Whale), starving p2pool and confounding reward_share

The XVB_DONATION_LEVEL=auto optimizer dynamically splits the fleet to climb
raffle tiers — observed ~96kH/s to XvB vs ~18kH/s to p2pool while chasing the
Whale tier. That starves p2pool (so reward_share, the primary metric, is
measured on a fraction of the fleet) and makes the split a function of Tor
latency — the exact thing we isolate. Disabling XvB sends the full ~269kH/s to
p2pool in both arms. Re-asserted on every apply so a switch/recovery can't let
it drift back on; --stop restores xvb.enabled=true.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e p2pool gets 0 peers

set_arm flipped p2pool.clearnet but left the Tor-egress firewall (#270) ON, which
DROPs direct clearnet dials from the container subnet. The clearnet arm would have
collected zero-peer/zero-share garbage for 3 of 6 blocks — and since both runs so
far only exercised block-1 TOR, it would have first bitten at the first switch
(~20h in), unattended. Now the firewall tracks the arm: ON for tor (fail-closed,
egress-gated), OFF for clearnet (p2pool peers over clearnet — the baseline). --stop
restores firewall=true so a mid-clearnet-block stop never leaves gouda open.
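
The arm-to-firewall coupling described in this commit amounts to a small lookup. A minimal sketch; arm_settings is a hypothetical function, and the key names and boolean values are inferred from the config keys mentioned in this PR (p2pool.clearnet, network.tor_egress_firewall), not copied from set_arm:

```shell
#!/usr/bin/env bash
# Hypothetical lookup: desired settings for each benchmark arm.
arm_settings() {
  case "$1" in
    tor)
      echo "p2pool.clearnet=false"
      echo "network.tor_egress_firewall=true"    # fail-closed, egress-gated
      ;;
    clearnet)
      echo "p2pool.clearnet=true"
      echo "network.tor_egress_firewall=false"   # otherwise clearnet dials are DROPped
      ;;
    *)
      echo "unknown arm: $1" >&2
      return 1
      ;;
  esac
}

arm_settings clearnet
```

Deriving both settings from the arm in one place makes it harder for the transport and the firewall to drift apart across a switch or recovery.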

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…all (jq // false-coercion)

`jq -r '.network.tor_egress_firewall // true'` returns true even when the key is
explicitly false, because jq's // alternative treats false (not just null) as
empty. So the documented #270 opt-out has never worked — the firewall was always
on regardless of config. Same latent bug on `.xvb.tor // true` (xvb.tor=false
silently stayed on Tor). Both now null-check explicitly so a configured false is
honoured; absent still defaults on (fail-closed / Tor-by-default preserved).

Surfaced by the #256 benchmark: the clearnet arm sets tor_egress_firewall=false
so p2pool can peer over clearnet, but the firewall stayed up → 0 sidechain peers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…jq false-coercion bug

Replaces the two inline null-checks (TOR_EGRESS_FIREWALL, xvb.tor) with a shared
config_bool() helper that honours an explicit false, and adds tests/stack unit
coverage (explicit false/true, absent→default). The prior firewall test only
exercised apply_tor_egress_firewall reading .env — it bypassed the config.json→.env
jq that actually had the bug. Behavior-identical to d4b5df3 (which is what gouda's
live benchmark runs); this is the cleaner, tested form for develop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>