feat(#256): autonomous Tor-vs-clearnet benchmark harness + finalized methodology by VijitSingh97 · Pull Request #293 · p2pool-starter-stack/pithead

VijitSingh97 · 2026-06-17T16:40:07Z

Supersedes the #268 draft. The gate for the final #165/#166 Tor-default decision.

What

- docs/benchmarks/tor-vs-clearnet.md: finalized methodology. Full fleet (miner-0..7, ~269 kH/s) on mini, where we are ~1.8% of the sidechain → ~155 of our shares/day: enough statistical power while staying representative (nano would make us ~11% of the sidechain and overstate the penalty; main is too sparse). Interleaved A/B (T/C/T/C blocks, settling period discarded) to control time-correlated variance, run to a target share count, with the clearnet arm's own variance as the noise floor. Also adds the autonomous-on-gouda architecture and runbook.
- bench-orchestrate.sh: autonomous driver (interleave + egress-gate + collector + per-cycle health + status.json heartbeat + self-heal). Runs detached on gouda; a cron watchdog and an @reboot entry restart it, and it resumes from state on crash/reboot, so the run survives the operator being offline for days.
- bench-healthcheck.sh: read-only checklist → OK/WARN/BROKEN.
- bench-status.sh: one-screen summary for `ssh gouda bench-status` over Tailscale.
- bench-collect.sh: per-interval /stats snapshotter (brought over from the #268 draft).
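The ~155 shares/day figure can be sanity-checked with a short calculation. This is only a sketch: it assumes a 10-second sidechain block target on mini (an assumption here, not a number stated in this PR) and ignores uncles, and `expected_shares_per_day` is a hypothetical helper name, not part of the harness.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expected own shares per day, given our fraction of the
# sidechain hashrate and the sidechain block target in seconds (assumed: 10).
expected_shares_per_day() {
  local fraction=$1 block_time_s=${2:-10}
  awk -v f="$fraction" -v t="$block_time_s" \
    'BEGIN { printf "%.1f\n", 86400 / t * f }'
}

expected_shares_per_day 0.018   # prints 155.5
```

Under those assumptions, 8640 sidechain shares per day times a 1.8% hashrate fraction gives roughly the ~155 shares/day quoted above.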

The live run

Live on gouda since 2026-06-17: --pool mini --block-hours 20 --blocks 6 --settle-hours 3 (~5 days, 3 blocks/arm, done ≈2026-06-22). XvB is disabled so the full ~269 kH/s reaches p2pool in both arms; otherwise the auto optimizer routes most of the fleet into XvB (chasing the Whale tier) and confounds reward_share, the primary metric. Only the p2pool transport (and the #270 firewall) flips between arms; monerod + Tari stay on Tor throughout.
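To make the run parameters concrete, here is a minimal sketch of the schedule they imply. It assumes blocks alternate starting with tor and that the settle window is the first `--settle-hours` of every block; `bench_schedule` is a hypothetical name, not a function in bench-orchestrate.sh.

```shell
#!/usr/bin/env bash
# Print one line per block: number, arm, start hour,
# first measured hour (after settling), end hour.
bench_schedule() {  # usage: bench_schedule BLOCKS BLOCK_HOURS SETTLE_HOURS
  local blocks=$1 block_hours=$2 settle_hours=$3
  # awk handles fractional hours, which shell integer arithmetic cannot.
  awk -v n="$blocks" -v b="$block_hours" -v s="$settle_hours" 'BEGIN {
    for (i = 1; i <= n; i++) {
      arm = (i % 2 == 1) ? "tor" : "clearnet"   # assumed: odd blocks are tor
      start = (i - 1) * b
      printf "%d %s %g %g %g\n", i, arm, start, start + s, start + b
    }
  }'
}

bench_schedule 6 20 3
```

For the live parameters this prints six lines, from `1 tor 0 3 20` to `6 clearnet 100 103 120`: 120 hours in total (5 days), with 3 blocks and 51 measured hours per arm.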

Two benchmark-blocking bugs surfaced + fixed (both validated live on gouda)

- set_arm now toggles the #270 egress firewall with the arm: ON for tor (fail-closed, egress-gated), OFF for clearnet (otherwise the firewall DROPs p2pool's clearnet dials → 0 sidechain peers → silent garbage).
- Real pithead bug (#294): network.tor_egress_firewall=false never disabled the firewall. jq's // true coerces an explicit false back to true, so the documented #270 opt-out has been dead since it shipped. Same latent bug on .xvb.tor. The fix is extracted into a config_bool() helper and regression-tested (tests/stack: 386 passed). Validated: the clearnet arm went from 0 to 54 peers, and the rigorous egress check was all-clean after switching back to tor.
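The jq behaviour behind #294 is easy to reproduce. The sketch below contrasts the buggy `//` pattern with an explicit null-check; the function signatures are hypothetical, and the real config_bool() in pithead may look different.

```shell
#!/usr/bin/env bash
# Buggy pattern: jq's `//` falls through on false as well as null,
# so an explicit false is replaced by the default.
config_bool_buggy() {   # usage: config_bool_buggy FILE JQ_PATH
  jq -r "$2 // true" "$1"
}

# Null-check sketch: only an absent/null key takes the default.
# Hypothetical signature; DEFAULT must be the JSON literal true or false.
config_bool() {         # usage: config_bool FILE JQ_PATH DEFAULT
  jq -r --argjson d "$3" \
    "($2) as \$v | if \$v == null then \$d else \$v end" "$1"
}
```

With a config.json containing `{"network":{"tor_egress_firewall":false}}`, `config_bool_buggy config.json .network.tor_egress_firewall` prints `true`, while `config_bool config.json .network.tor_egress_firewall true` prints `false`. When the key is absent, both print `true`, so the fail-closed default is preserved.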

Validated end-to-end on gouda

A fast 2-block dry run exercised the whole machine (arm switch via pithead apply, egress-gate=PASS/skip, per-arm collector, bench-status, RUN COMPLETE, clean self-restore). shellcheck clean.

Merge after the run completes (~2026-06-22): the per-arm results + the final #165/#166 Tor-default recommendation land in docs/privacy.md in this PR before merge, so closing #256 delivers the actual decision (not just the harness).

Closes #256
Closes #294

🤖 Generated with Claude Code

…rd the collector

Finalize the methodology against real numbers: full fleet miner-0..7 (~269 kH/s) on the `mini` sidechain (~1.8% of it → ~155 of our shares/day; representative + enough power; `nano` would make us ~11% and overstate the penalty, `main` is too sparse). Switch the design to an INTERLEAVED A/B (T/C/T/C in ~1.5-2d blocks, discard ~6h post switch) to control time-correlated variance, run to a target share count, and use the clearnet arm's own variance as the noise floor.

Add the "autonomous on gouda" section + runbook: the run lives entirely on the box (detached orchestrator + cron watchdog + @reboot) so it survives the operator being offline for days; observe over Tailscale via `bench-status`. Scripts (orchestrate / healthcheck / status) land next, validated on gouda before the real run.

Brings bench-collect.sh onto develop (was only on the #256 draft); supersedes #268.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…+ status)

Three gouda-resident scripts so the Tor-vs-clearnet run survives the operator being offline for days:

- bench-orchestrate.sh: interleaves the arms on a fixed block schedule (T/C/T/C), egress-gates each switch (rigorous multi-poll), keeps the collector running, health-checks each cycle, writes a status.json heartbeat, and self-heals (re-up stack + re-gate on BROKEN). All state in ~/pithead-bench, so a crash/reboot resumes from state. Subcommands: run | --calibrate | --watchdog | --status | --stop | --install-cron (the watchdog cron + @reboot restart it if it dies).
- bench-healthcheck.sh: read-only checklist -> OK/WARN/BROKEN (stack, p2pool, hashrate, Tor-arm egress + firewall); reused by the loop + the cron.
- bench-status.sh: one-screen summary for `ssh gouda bench-status` over Tailscale.

shellcheck clean. To be validated on gouda (dry-run) before the real fleet run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
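The commit does not say how the watchdog decides the orchestrator has died. One plausible approach, sketched here purely as an assumption, is to treat a status.json heartbeat that has not been rewritten recently as a sign of a dead loop; `heartbeat_is_stale` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed (exit 0) when the heartbeat file is missing
# or was last modified more than MAX_AGE_S seconds ago.
heartbeat_is_stale() {  # usage: heartbeat_is_stale FILE MAX_AGE_S
  local file=$1 max_age_s=$2 now mtime
  [ -f "$file" ] || return 0
  now=$(date +%s)
  # GNU stat first, BSD stat as a fallback.
  mtime=$(stat -c %Y "$file" 2>/dev/null || stat -f %m "$file")
  [ $(( now - mtime )) -gt "$max_age_s" ]
}
```

A watchdog invocation could then restart the loop only when `heartbeat_is_stale "$HOME/pithead-bench/status.json" 900` succeeds; both the file location inside ~/pithead-bench and the 900-second threshold are illustrative guesses.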

…ng the fleet to XvB (target Whale), starving p2pool and confounding reward_share

The XVB_DONATION_LEVEL=auto optimizer dynamically splits the fleet to climb raffle tiers: observed ~96kH/s to XvB vs ~18kH/s to p2pool while chasing the Whale tier. That starves p2pool (so reward_share, the primary metric, is measured on a fraction of the fleet) and makes the split a function of Tor latency, the exact thing we isolate. Disabling XvB sends the full ~269kH/s to p2pool in both arms. Re-asserted on every apply so a switch/recovery can't let it drift back on; --stop restores xvb.enabled=true.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…e p2pool gets 0 peers

set_arm flipped p2pool.clearnet but left the Tor-egress firewall (#270) ON, which DROPs direct clearnet dials from the container subnet. The clearnet arm would have collected zero-peer/zero-share garbage for 3 of 6 blocks, and since both runs so far only exercised block-1 TOR, it would have first bitten at the first switch (~20h in), unattended. Now the firewall tracks the arm: ON for tor (fail-closed, egress-gated), OFF for clearnet (p2pool peers over clearnet, the baseline). --stop restores firewall=true so a mid-clearnet-block stop never leaves gouda open.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
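The rule "the firewall tracks the arm" can be sketched as a small lookup. The config keys are the ones named in this PR, but the function name, the output format, and the exact boolean values per arm are assumptions; the real set_arm also has to write the config and run pithead apply, which is not shown.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print the key=value settings each arm should carry.
arm_settings() {  # usage: arm_settings tor|clearnet
  case "$1" in
    tor)
      printf '%s\n' 'p2pool.clearnet=false' 'network.tor_egress_firewall=true' ;;
    clearnet)
      printf '%s\n' 'p2pool.clearnet=true' 'network.tor_egress_firewall=false' ;;
    *)
      echo "unknown arm: $1" >&2
      return 1 ;;
  esac
  # XvB stays off in both arms so the full fleet reaches p2pool.
  printf '%s\n' 'xvb.enabled=false'
}
```

Keeping the transport and the firewall in one place makes it harder for a future change to flip one without the other, which is the failure this commit fixes.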

…all (jq // false-coercion)

`jq -r '.network.tor_egress_firewall // true'` returns true even when the key is explicitly false, because jq's // alternative treats false (not just null) as empty. So the documented #270 opt-out has never worked: the firewall was always on regardless of config. Same latent bug on `.xvb.tor // true` (xvb.tor=false silently stayed on Tor). Both now null-check explicitly so a configured false is honoured; absent still defaults on (fail-closed / Tor-by-default preserved).

Surfaced by the #256 benchmark: the clearnet arm sets tor_egress_firewall=false so p2pool can peer over clearnet, but the firewall stayed up → 0 sidechain peers.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…jq false-coercion bug

Replaces the two inline null-checks (TOR_EGRESS_FIREWALL, xvb.tor) with a shared config_bool() helper that honours an explicit false, and adds tests/stack unit coverage (explicit false/true, absent→default). The prior firewall test only exercised apply_tor_egress_firewall reading .env, so it bypassed the config.json→.env jq that actually had the bug.

Behavior-identical to d4b5df3 (which is what gouda's live benchmark runs); this is the cleaner, tested form for develop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

VijitSingh97 and others added 12 commits June 17, 2026 11:09

fix(#256): float-safe block/settle hours in the orchestrator (hours→secs via awk)

0ef4c4f

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
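Why awk: bash's `$(( ))` arithmetic is integer-only, so a fractional value such as `--block-hours 1.5` cannot be multiplied by 3600 in the shell itself. A minimal sketch of the conversion follows; `hours_to_secs` is a hypothetical name, and the orchestrator's actual code is not shown in this PR description.

```shell
#!/usr/bin/env bash
# Convert a possibly fractional hour count to whole seconds.
# The arithmetic is delegated to awk; adding 0.5 rounds to the nearest second.
hours_to_secs() {
  awk -v h="$1" 'BEGIN { printf "%d\n", h * 3600 + 0.5 }'
}

hours_to_secs 20    # prints 72000
hours_to_secs 1.5   # prints 5400
```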

fix(#256): healthcheck uses hashrate_15m (not 1h) so it doesn't false-WARN during the ramp

d3b6dc3

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(#256): gate self-heals grandfathered Tari leak (restart) + bench-status missing-jsonl guard

a1cbd47

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(#256): per-cycle healthcheck egress matches gate rigor (3-poll/30s) to avoid false BROKEN on brief Tari dials

f1c5f0b

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(#256): healthcheck keys on proxy_workers (reliable), not p2pool's noisy share-based hashrate estimate

9f3b718

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

fix(#256): bench-status leads with 'stable for Xh' so quiet runs don't read as failing (old launch events linger in the tail)

707cbdc

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

This was referenced Jun 17, 2026

network.tor_egress_firewall=false never disables the #270 firewall (jq // false-coercion) #294

Open

[draft] benchmark: Tor vs clearnet while mining — methodology (#256) #268

Closed

VijitSingh97 wants to merge 12 commits into develop from feat/256-benchmark-harness
