Motivation
We currently rely on unit/integration test suites and manual QA to catch regressions in React on Rails. That coverage misses the failure modes that hurt users most in production: silent data leakage across requests, slow memory growth in the SSR JS context / Pro Node renderer, performance regressions from config changes, and the long tail of "how a novice (or distracted senior) misuses the framework" — case-sensitive component names, props with BigDecimal/Date/NaN, fragment_cache wrapping react_component without a per-user key, 'use client' missing on legacy packs after enabling RSC, etc.
We need an opinionated, repeatable harness that uses the framework adversarially — not one that only reads its source.
Proposal
Add a Claude Code slash command at .claude/commands/stress-test.md that orchestrates a no-mercy QA stress test of the framework. Sub-agents play senior engineers, hackers, and pentesters: they scaffold throwaway demos in tmp/stress-test-<timestamp>/, drive them with extreme/novice/adversarial usage, and report concise findings (≤2 paragraphs each) with repros in sibling files. The maintainer asks for more on demand.
Required cross-cutting concerns (first-class for every vector)
- Data leakage — cross-request / cross-tenant / client-bundle canary tracing.
- Memory leakage — RSS/FD slope across N requests, heap snapshots.
- Performance degradation — p50/p95/p99 latency, throughput, baseline regression.
Argument shape
| Form |
Meaning |
| (empty) |
Whole framework at current main |
<commit-sha> / <PR#> / PR URL |
Focus on what that change touches |
--from <sha> [--to <sha-or-branch>] |
Commit range |
--features <list> |
Filter to specific areas (rsc, streaming, rsc-payload, ssr-no-streaming, hydration, auto-bundling, caching, turbo, replay-console, node-renderer, …); intersected with commit scope when both are given |
--tier quick|standard|deep|exhaustive |
Coverage tier (default standard; auto-quick for small commits/PRs) |
--max-hours N |
Hard wallclock ceiling |
--no-network-fault, --skip-pro, --repo <path> |
Toggles |
Phases
- Scope resolution + feature inventory.
- Workspace setup, gem/pnpm packing,
LEAK_CANARY_<uuid> planting.
- Demo scaffolding (parallel, feature-driven).
- Black-box brutal usage round (extreme user / novice / distracted senior / attacker / ops engineer / malicious).
- White-box source-targeted attacks (data-leak / memory / perf hypotheses per source area).
- Pentest pass (XSS, secret leak, prototype pollution, prompt injection in
railsContext, cache poisoning, DoS).
- Two-persona doc compare (docs-only vs source-spelunker).
- Network-fault simulation (toxiproxy preferred; falls back to
SIGSTOP/SIGCONT on demo processes only — never iptables / sudo).
- Reporting (markdown only) + gated GitHub issue creation.
Safety
- Never modifies framework source.
- Never pushes / commits / opens issues without explicit user approval.
- Workspace lives under
tmp/stress-test-<timestamp>/ (already in .gitignore).
- Synthetic
LEAK_CANARY_<uuid> markers only — no real credentials.
- No Pro license required (Pro logs warnings; command captures them).
Implementation
PR: #3207
Motivation
We currently rely on unit/integration test suites and manual QA to catch regressions in React on Rails. That coverage misses the failure modes that hurt users most in production: silent data leakage across requests, slow memory growth in the SSR JS context / Pro Node renderer, performance regressions from config changes, and the long tail of "how a novice (or distracted senior) misuses the framework" — case-sensitive component names, props with
BigDecimal/Date/NaN,fragment_cachewrappingreact_componentwithout a per-user key,'use client'missing on legacy packs after enabling RSC, etc.We need an opinionated, repeatable harness that uses the framework adversarially — not one that only reads its source.
Proposal
Add a Claude Code slash command at
.claude/commands/stress-test.mdthat orchestrates a no-mercy QA stress test of the framework. Sub-agents play senior engineers, hackers, and pentesters: they scaffold throwaway demos intmp/stress-test-<timestamp>/, drive them with extreme/novice/adversarial usage, and report concise findings (≤2 paragraphs each) with repros in sibling files. The maintainer asks for more on demand.Required cross-cutting concerns (first-class for every vector)
Argument shape
main<commit-sha>/<PR#>/ PR URL--from <sha> [--to <sha-or-branch>]--features <list>--tier quick|standard|deep|exhaustivestandard; auto-quickfor small commits/PRs)--max-hours N--no-network-fault,--skip-pro,--repo <path>Phases
LEAK_CANARY_<uuid>planting.railsContext, cache poisoning, DoS).SIGSTOP/SIGCONTon demo processes only — never iptables / sudo).Safety
tmp/stress-test-<timestamp>/(already in.gitignore).LEAK_CANARY_<uuid>markers only — no real credentials.Implementation
PR: #3207