Add /stress-test Claude Code slash command for adversarial QA #3208

@AbanoubGhadban

Motivation

We currently rely on unit/integration test suites and manual QA to catch regressions in React on Rails. That coverage misses the failure modes that hurt users most in production: silent data leakage across requests, slow memory growth in the SSR JS context or Pro Node renderer, performance regressions from config changes, and the long tail of ways a novice (or distracted senior) misuses the framework. Examples of that long tail include case-sensitive component names, props containing BigDecimal/Date/NaN, fragment_cache wrapping react_component without a per-user key, and a missing 'use client' on legacy packs after enabling RSC.

We need an opinionated, repeatable harness that uses the framework adversarially — not one that only reads its source.

Proposal

Add a Claude Code slash command at .claude/commands/stress-test.md that orchestrates a no-mercy QA stress test of the framework. Sub-agents play senior engineers, hackers, and pentesters: they scaffold throwaway demos in tmp/stress-test-<timestamp>/, drive them with extreme, novice, and adversarial usage, and report concise findings (≤2 paragraphs each) with repros in sibling files. The maintainer can ask for more detail on any finding on demand.

Required cross-cutting concerns (first-class for every vector)

  • Data leakage — cross-request / cross-tenant / client-bundle canary tracing.
  • Memory leakage — RSS/FD slope across N requests, heap snapshots.
  • Performance degradation — p50/p95/p99 latency, throughput, baseline regression.
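As an illustration of how the memory and performance concerns above could be quantified, here is a minimal Python sketch. The function names, the nearest-rank percentile method, and the 10% tolerance are assumptions made for this example; the issue does not specify how the command computes these metrics.

```python
import math
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def slope_per_request(values):
    """Least-squares slope of a metric (e.g. RSS bytes or open FDs) against request index."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def regressed(baseline_ms, candidate_ms, pct=95, tolerance=0.10):
    """Hypothetical gate: flag the candidate if its percentile exceeds baseline by more than tolerance."""
    return percentile(candidate_ms, pct) > percentile(baseline_ms, pct) * (1 + tolerance)
```

A slope that stays clearly positive across N requests suggests a leak worth a heap snapshot; a flat slope with a high absolute value points at a large but stable footprint instead.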

Argument shape

| Form | Meaning |
| --- | --- |
| (empty) | Whole framework at current main |
| <commit-sha> / <PR#> / PR URL | Focus on what that change touches |
| --from <sha> [--to <sha-or-branch>] | Commit range |
| --features <list> | Filter to specific areas (rsc, streaming, rsc-payload, ssr-no-streaming, hydration, auto-bundling, caching, turbo, replay-console, node-renderer, …); intersected with commit scope when both are given |
| --tier quick\|standard\|deep\|exhaustive | Coverage tier (default standard; auto-quick for small commits/PRs) |
| --max-hours N | Hard wallclock ceiling |
| --no-network-fault, --skip-pro, --repo <path> | Toggles |
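The rule that --features is intersected with commit scope can be sketched as plain set logic. This is only an illustration in Python: resolve_features and its parameters are hypothetical names, and how a commit or PR is mapped to feature areas is assumed to happen elsewhere.

```python
def resolve_features(inventory, commit_scope=None, requested=None):
    """Return the feature areas to test.

    inventory:    every feature area the harness knows about
    commit_scope: areas touched by the commit/PR/range, or None for the whole framework
    requested:    areas named via --features, or None when the flag is absent
    """
    selected = set(inventory)
    if commit_scope is not None:
        selected &= set(commit_scope)
    if requested is not None:
        unknown = set(requested) - set(inventory)
        if unknown:
            raise ValueError(f"unknown feature(s): {sorted(unknown)}")
        # When both filters are given, only areas present in both survive.
        selected &= set(requested)
    return sorted(selected)
```

For example, a commit that touches rsc and caching combined with a request for caching and streaming would narrow the run to caching alone.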

Phases

  1. Scope resolution + feature inventory.
  2. Workspace setup, gem/pnpm packing, LEAK_CANARY_<uuid> planting.
  3. Demo scaffolding (parallel, feature-driven).
  4. Black-box brutal usage round (extreme user / novice / distracted senior / attacker / ops engineer / malicious).
  5. White-box source-targeted attacks (data-leak / memory / perf hypotheses per source area).
  6. Pentest pass (XSS, secret leak, prototype pollution, prompt injection in railsContext, cache poisoning, DoS).
  7. Two-persona doc compare (docs-only vs source-spelunker).
  8. Network-fault simulation (toxiproxy preferred; falls back to SIGSTOP/SIGCONT on demo processes only — never iptables / sudo).
  9. Reporting (markdown only) + gated GitHub issue creation.
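The canary planting in phase 2 and the data-leak checks in later phases could work roughly as follows. This Python sketch is an assumption about one possible approach, not the command's actual implementation; new_canary, find_leaks, and the (requester, body) response shape are hypothetical.

```python
import re
import uuid

# Matches the synthetic marker format LEAK_CANARY_<uuid> (lowercase hyphenated UUID).
CANARY_RE = re.compile(
    r"LEAK_CANARY_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def new_canary():
    """Create a fresh synthetic marker to plant in one tenant's or request's data."""
    return f"LEAK_CANARY_{uuid.uuid4()}"


def find_leaks(responses, owner_of):
    """Report canaries that show up in a response for someone other than their owner.

    responses: iterable of (requester, body) pairs captured from the demo app
    owner_of:  dict mapping each planted canary to the requester it belongs to
    Returns a list of (requester, owner, canary); owner is None for unplanted markers.
    """
    leaks = []
    for requester, body in responses:
        for canary in CANARY_RE.findall(body):
            owner = owner_of.get(canary)
            if owner != requester:
                leaks.append((requester, owner, canary))
    return leaks
```

The same scan can be pointed at built client bundles, with every planted canary treated as foreign, to catch server-only values that end up shipped to the browser.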

Safety

  • Never modifies framework source.
  • Never pushes / commits / opens issues without explicit user approval.
  • Workspace lives under tmp/stress-test-<timestamp>/ (already in .gitignore).
  • Synthetic LEAK_CANARY_<uuid> markers only — no real credentials.
  • No Pro license required (Pro logs warnings; command captures them).

Implementation

PR: #3207
