feat(scenarios): add 3 acceptance scenarios for pr-review and code-review dispatcher by bborbe · Pull Request #37 · bborbe/coding

bborbe · 2026-06-03T07:37:33Z

Summary

Adds three end-to-end acceptance scenarios under `scenarios/` following the dark-factory scenario-writing guide. Each scenario was drafted, audited by `dark-factory:scenario-auditor` twice, and tightened against the audit findings before commit.

#	Title	Validates	Regression risk locked
001	toolchain-preflight	Dispatcher Step 4.0 (PR #36) aborts when ast-grep is absent	coding#34 (silent loop, 30-min activeDeadlineSeconds kill instead of immediate fail)
002	clean-pr-zero-findings	Zero-violation diff in standard mode produces empty findings, no hallucination	LLM tier invents findings to fill an empty mechanical surface
003	scaling-funnel-100-files	100-file synthetic PR completes in ≤30 LLM calls and ≤30 minutes (Phase 10)	coding#27 (30-min activeDeadlineSeconds ceiling)

Four-criteria gate

Each scenario passes the writing-guide's strict bar for adding an E2E scenario:

Unit/integration cannot reach (real PATH resolution, real slash-command invocation, real Claude CLI counting)
Load-bearing user journey
No existing scenario covers it (numbered 001/002/003)
Concrete regression risk tied to a prior incident

Out of scope

The other 5 items in the task-page acceptance suite (mode coverage, per-language routing, context loading, citation discipline, broken-YAML isolation) are better as unit/integration tests on the specific components — they don't clear the four-criteria gate for an E2E scenario.

Test plan

`make precommit` clean (131 rules, 28 mechanical YAMLs, no drift)
Each scenario audited by `dark-factory:scenario-auditor` (twice)
All audit-flagged issues resolved (placeholders → concrete commands, multi-mode → split, internal-string greps → contract-level assertions, ambiguous counter → PATH-shim with positive control)
Bot review
Walk scenario 001 manually to validate the dispatcher fail-fast (after merge)

…view dispatcher Scenarios live in scenarios/ following the dark-factory scenario writing guide (~/Documents/workspaces/dark-factory/docs/rules/scenario-writing.md). All three drafted, audited by dark-factory:scenario-auditor (twice per file), and tightened against the audit findings. 001-toolchain-preflight: the dispatcher fail-fasts when ast-grep is absent from PATH. Validates PR #36's preflight wiring. The regression risk being locked down is the 30-min activeDeadlineSeconds kill observed on coding#34 instead of immediate-fail. PATH-masking technique is host-safe. 002-clean-pr-zero-findings: a README-only diff flows through standard mode end-to-end and surfaces no findings. Locks down the regression risk that the LLM tier hallucinates findings to fill the void when the mechanical layer surfaces none. Standard mode only — short-mode and full-mode variants are separate journeys (split per the scenario-writing guide). 003-scaling-funnel-100-files: a synthetic 100-file PR with known mechanical violations completes /coding:pr-review master standard in ≤30 LLM calls and ≤30 minutes. Phase 10 of the doc-driven pipeline. LLM-call counter implemented as a PATH-shim wrapper around the claude CLI. Companion scenario 004 (rule-count doubling decoupling) deferred — split per the one-journey-per-file rule. Each scenario meets the four-criteria gate from the writing guide: unit/integration cannot reach the behavior, load-bearing user journey, no existing scenario, nameable regression risk tied to a concrete prior incident (coding#34 / coding#27 / Phase 10).

ben-s-pull-request-reviewer

Reviewed by ben-s-pull-request-reviewer[bot] — no concerns flagged.

ben-s-pull-request-reviewer

{
  "verdict": "approve",
  "summary": "PR #37 has already been merged into master (commit 3b6adf2). The diff against master is empty — no reviewable changes remain. Manual review of the three new scenario files (001-toolchain-preflight.md, 002-clean-pr-zero-findings.md, 003-scaling-funnel-100-files.md) found two minor observations: Scenario 003 lacks a `trap` to guarantee `$WORK` cleanup on early-exit failure paths (the Cleanup step won't run if the scenario errors out midway), and all three scenarios remain `status: draft` with no documented promotion path to `status: accepted` if that's a project convention requirement.",
  "comments": [
    {
      "file": "scenarios/003-scaling-funnel-100-files.md",
      "line": 52,
      "severity": "minor",
      "message": "Should Fix (Important): The Cleanup step (line 69) runs only on success. If the scenario exits early due to an error or unmet assertion, `rm -rf \"$WORK\"` never executes and the mktemp directory leaks. Adding `trap 'rm -rf \"$WORK\"' EXIT` near the top of the Action block would guarantee cleanup regardless of exit path."
    },
    {
      "file": "scenarios/001-toolchain-preflight.md",
      "line": 13,
      "severity": "nit",
      "message": "Nice to Have (Optional): The `command -v ast-grep` check (line 17-18) validates PATH resolution but doesn't confirm the binary is actually executable. Consider adding `[ -x \"$(command -v ast-grep)\" ]` to verify not just visibility but executability in environments where PATH contains non-executable scripts with the same name."
    }
  ],
  "concerns_addressed": [
    "performance: Scenario 003 shim wrapper is reliable for session-scoped Claude CLI invocations — bypass is a known limitation documented in the scenario",
    "correctness: Scenario 001 dispatcher PATH preflight is validated — the expected `ast-grep/sg not in PATH` error message is specified and verifiable",
    "correctness: Scenario 002 report format uses literal `None.` strings — fragile by design (explicit trade-off documented), verified by specific awk extraction commands",
    "security: Scenario 003 cleanup runs on success but not on early-exit failure — see finding above",
    "tests: All scenarios have status: draft — no promotion criteria documented; may be intentional for test-contract documentation"
  ]
}

…bot review Bot correctly flagged that scenarios/004 was not referenced in README.md or llms.txt; broader audit found scenarios/001-003 were also never indexed (oversight in PRs #37, #38, #40). Closing the gap retroactively for all 4 active scenarios. README.md: new 'Acceptance Scenarios' section above Contributing with a 4-row table linking each scenario file + one-line 'Validates...' description. llms.txt: parallel section at the bottom so AI agents relying on llms.txt for discovery can find the scenarios alongside guides.

/coding:commit's pipeline-only detection lumped scenarios/ with prompts/ and specs/ — meaning a PR adding ONLY scenario files would route to Workflow E (commit + push, no changelog entry). That's wrong for repos where scenarios are shipped acceptance contracts that users invoke via /dark-factory:run-scenario; in that role they're release-relevant artifacts on the same footing as docs/ and rules/. Without this fix, today's PRs #37 (3 scenarios drafted), #40 (scenarios rewritten + promoted to active), and #41 (4th scenario added) would each have shipped with no changelog record. The v0.15.0 release notes that this PR drafts had to retroactively list scenarios as a feature. Updated commands/commit.md:139-153 — drop scenarios/ from the pipeline-only list and the corresponding rationale paragraph. Prompts and specs remain pipeline-only (dark-factory daemon runtime state, not shipped artifacts).

ben-s-pull-request-reviewer Bot reviewed Jun 3, 2026

View reviewed changes

bborbe merged commit 3b6adf2 into master Jun 3, 2026
1 check passed

bborbe deleted the feat/add-scenarios branch June 3, 2026 07:42

ben-s-pull-request-reviewer Bot approved these changes Jun 3, 2026

View reviewed changes

ben-s-pull-request-reviewer Bot mentioned this pull request Jun 3, 2026

docs(changelog): consolidate Unreleased entries since v0.14.0 #42

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(scenarios): add 3 acceptance scenarios for pr-review and code-review dispatcher#37

feat(scenarios): add 3 acceptance scenarios for pr-review and code-review dispatcher#37
bborbe merged 1 commit into
masterfrom
feat/add-scenarios

bborbe commented Jun 3, 2026

Uh oh!

ben-s-pull-request-reviewer Bot left a comment

Uh oh!

Uh oh!

ben-s-pull-request-reviewer Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

bborbe commented Jun 3, 2026

Summary

Four-criteria gate

Out of scope

Test plan

Uh oh!

ben-s-pull-request-reviewer Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ben-s-pull-request-reviewer Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant