feat(pipeline): dispatcher refactor — ast-grep funnel + citation validator + coverage lint #27

Merged
bborbe merged 1 commit into master from feat/dispatcher-refactor on Jun 2, 2026

@bborbe bborbe commented Jun 2, 2026

Summary

Phases 4-8 of the original task (the dispatcher unlock). With 124 rules in the index and 15 mechanical YAMLs, /coding:pr-review now consumes the rule base via a 3-layer pipeline:

Step 4a:  ast-grep-runner (cheap mechanical funnel; no LLM cost)
            ↓ findings_by_owner JSON
Step 4b:  N per-language agents in parallel (LLM-tier adjudication only)
            ↓ findings citing rule IDs
Step 4d:  validate-citations.sh (drops hallucinated rule_ids)
            ↓ validated findings
Step 5:   consolidated report
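
The fan-out between Step 4a and Step 5 can be illustrated with a minimal Python sketch. It is not the dispatcher's actual implementation (the dispatcher is a prompt-driven command, not a Python program). The `adjudicate` stand-in, the `rule_id` and `file` field names, and the example owner names are assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def adjudicate(owner, findings):
    """Hypothetical stand-in for one per-language agent (Step 4b).

    A real agent would apply LLM judgment; this one passes every
    pre-filtered finding through and tags it with its owner.
    """
    return [dict(f, owner=owner) for f in findings]


def fan_out(findings_by_owner, agent=adjudicate):
    """Run one agent per Owner concurrently and merge the results (Step 5 input)."""
    owners = sorted(findings_by_owner)
    if not owners:
        return []
    with ThreadPoolExecutor(max_workers=len(owners)) as pool:
        # map() yields results in the order of `owners`, so output is deterministic.
        batches = pool.map(agent, owners, [findings_by_owner[o] for o in owners])
        return [f for batch in batches for f in batch]
```

Each Owner receives only its own finding set, so no agent waits on another; the merged list is what the citation validator would then filter.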

What's in this PR

  • agents/ast-grep-runner.md (new) — single-responsibility mechanical funnel; reads rules/index.json, runs ast-grep per YAML, parses JSON output, groups findings by Owner. Read-only, no LLM tools, defined output contract.
  • commands/pr-review.md (rewritten Step 2.5 + Step 4) — dispatcher invokes the runner, fans out per-Owner adjudication concurrently, validates citations, then consolidates. Keeps the context-specific convention reads (teamvault / k8s-binary / k8s-manifest / changelog) intact.
  • scripts/validate-citations.sh (new) — filters review findings against the index's rule_id set; logs drift to stderr.
  • scripts/check-coverage.sh (new) — coverage lint catching three drift modes: enforcement-cited YAML missing, orphan YAML not in index, YAML id mismatch with index.
  • Makefile: precommit chain extended to check-links check-json check-index check-coverage.
  • 3 per-language agents simplified (go-error, go-time, go-context) to the new "expect pre-filtered findings + own judgment-tier rules" shape. Remaining ~25 agents continue in legacy mode; migration is a 30-min/agent follow-up.
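
The citation-validation behavior listed above (filter against the index's rule_id set, log drift to stderr) could look roughly like the following. This is a Python sketch of the logic only; the real script is shell and its internals are not shown in this PR. The `rule_id` field name, the log message wording, and the choice of `1` as the non-zero exit status are assumptions.

```python
import sys


def validate_citations(findings, known_rule_ids, log=sys.stderr):
    """Split findings into kept/dropped by whether their rule_id is in the index.

    Returns (kept_findings, exit_code). exit_code is non-zero when anything
    was dropped, so a caller can notice drift and still use the kept subset.
    """
    kept, dropped = [], []
    for finding in findings:
        target = kept if finding.get("rule_id") in known_rule_ids else dropped
        target.append(finding)
    for finding in dropped:
        # Hypothetical message format; the actual script's wording is unknown.
        print(f"unknown rule_id: {finding.get('rule_id')}", file=log)
    return kept, (1 if dropped else 0)
```

Returning the validated subset together with the status, rather than failing outright, matches the idea that the dispatcher records the drift signal but still consolidates what survived.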

Why one PR vs five

Each phase is small individually (<100 LOC) but they reference each other: the dispatcher cites the runner's output contract, the citation validator references the dispatcher's flow, the coverage lint shares the schema discipline. Splitting into 5 PRs would multiply the review-cycle cost without buying review clarity (each PR depends on the prior to make sense).

Test plan

  • `make precommit` clean — links + JSON + check-index + new `check-coverage` all pass
  • `scripts/validate-citations.sh` smoke-tested with valid + invalid rule_ids (correct drop + non-zero exit on invalid)
  • `scripts/check-coverage.sh` against current state: `OK (124 rules, 15 mechanical YAMLs, no drift)`
  • End-to-end smoke against an old merged PR — out of scope for this PR; tracked as follow-up

Follow-ups noted in task page

  • Migrate the remaining ~25 per-language agents to the dispatcher-shaped prompt (30 min/agent, parallelizable)
  • End-to-end smoke against an old PR with known-violating code
  • Add per-Owner timing instrumentation to the dispatcher so the funnel ROI can be measured

…dator + coverage lint

Phase 4-8 of the original task — the dispatcher unlock. The 124-rule
index now has a real pipeline consuming it.

ARCHITECTURE

Before: commands/pr-review.md loaded specific docs (Step 2.5 hardcoded
table) and dispatched a fixed 4-agent list (Step 4). Each agent
re-implemented its own pattern detection on top of reading its
specific doc — no use of rules/index.json or rules/<lang>/*.yml.

After: dispatcher invokes a thin mechanical-funnel agent
(ast-grep-runner), receives findings grouped by Owner, then dispatches
per-Owner judgment-tier adjudication concurrently. Citation validator
drops findings citing missing rule IDs. Coverage lint catches drift
between docs, index, and YAMLs at precommit time.
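
As an illustration of "findings grouped by Owner", here is a small Python sketch. The index schema is not shown in this PR, so the `id` and `owner` keys are assumed; `ruleId` is assumed to be the key that identifies the rule in each ast-grep match record.

```python
from collections import defaultdict


def group_by_owner(matches, index_entries):
    """Group mechanical matches under the Owner that the index assigns to each rule."""
    owner_of = {entry["id"]: entry["owner"] for entry in index_entries}
    grouped = defaultdict(list)
    for match in matches:
        owner = owner_of.get(match["ruleId"])
        if owner is None:
            # Match produced by a rule the index does not list; skipped here.
            continue
        grouped[owner].append(match)
    return dict(grouped)
```

The resulting mapping is what a dispatcher could hand out, one entry per agent.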

WHAT'S IN THIS PR

1. agents/ast-grep-runner.md (new) — single-responsibility mechanical
   funnel. Reads rules/index.json, runs ast-grep per YAML, parses
   JSON output, groups findings by Owner. Never invokes LLM tools;
   never modifies files. Defined output contract so the dispatcher
   can pipe it directly to per-language agents.

2. commands/pr-review.md (rewritten Step 2.5 + Step 4) — now:
   - Step 4a: invoke ast-grep-runner → findings grouped by Owner
   - Step 4b: per-Owner concurrent dispatch with adjudication prompt
     containing the pre-filtered findings + judgment-tier rule IDs
     the agent owns
   - Step 4c: context-specific convention reads (kept teamvault /
     k8s-binary / k8s-manifest / changelog mappings as before — these
     are file-pattern triggers, not rule-id-driven)
   - Step 4d: citation validation via scripts/validate-citations.sh
   Per-Owner dispatch fans out concurrently — N agents see N
   independent finding sets, no serialization.

3. scripts/validate-citations.sh (new) — filters review-output
   findings against rules/index.json's rule_id set. Drops findings
   with missing IDs, logs offenders to stderr, exits non-zero if any
   drops occurred (so the dispatcher captures the drift signal but
   still consolidates the validated subset).

4. scripts/check-coverage.sh (new) — coverage lint. Three checks:
   a) every rule's enforcement-cited rules/<lang>/<file>.yml path
      resolves to an existing file
   b) every rules/<lang>/*.yml is referenced by exactly one index
      entry (catches orphan YAMLs after rule renames)
   c) every YAML's id: field matches an index entry's id (catches
      rename drift between doc and YAML)
   Same shape as PR #13's check-index.

5. Makefile — precommit chain extended:
     precommit: check-links check-json check-index check-coverage
   New check-coverage target wraps the script.

6. Three per-language agents (go-error, go-time, go-context)
   simplified to expect pre-filtered findings + judgment-tier rule
   ownership. The rest of the agents continue working in the
   legacy 'scan + judge' mode for now and migrate in follow-up PRs
   — the dispatcher tolerates both shapes during the transition
   (agents that re-scan emit findings; the runner's pre-filtered
   set just doesn't overlap with theirs).
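
The three coverage checks from item 4 can be sketched in Python over in-memory data. This is an illustration under assumptions, not the shell script itself: the `id` and `enforcement` keys, the example paths, and the message wording are all hypothetical.

```python
from collections import Counter


def check_coverage(index_entries, yaml_ids):
    """Return a list of drift messages; an empty list means no drift.

    index_entries: list of {"id": str, "enforcement": YAML path or None}
    yaml_ids:      {YAML path on disk: id declared inside that YAML}
    """
    problems = []
    refs = Counter(e["enforcement"] for e in index_entries if e.get("enforcement"))
    index_ids = {e["id"] for e in index_entries}

    # (a) every enforcement-cited path must exist on disk
    for path in sorted(refs):
        if path not in yaml_ids:
            problems.append(f"missing YAML: {path}")

    # (b) every YAML on disk must be referenced by exactly one index entry
    for path in sorted(yaml_ids):
        if refs[path] != 1:
            problems.append(f"YAML referenced {refs[path]} times: {path}")

    # (c) every YAML's declared id must match some index entry's id
    for path, yaml_id in sorted(yaml_ids.items()):
        if yaml_id not in index_ids:
            problems.append(f"id mismatch: {path} declares {yaml_id}")

    return problems
```

A precommit wrapper would fail the build whenever the returned list is non-empty.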

OPERATIONAL NOTES

- The dispatcher refactor is the original Phase 4-8 unlock from
  [[Refactor coding pr-review to doc-driven rules pipeline]]. With
  124 rules in the index and 15 mechanical YAMLs across rules/go/,
  the pipeline has real material to operate on from day one.

- Smoke testing the dispatcher end-to-end against an old PR is the
  next step before this lands on master; smoke fixtures live as
  judgment in /coding:pr-review invocations.

- Per-language agents not yet migrated (the other ~25) continue
  to work in legacy mode — adding their dispatcher-shaped prompt
  is a 30-min change per agent, parallelizable.

make precommit clean (links + JSON + check-index + new check-coverage).
@bborbe bborbe merged commit 1319467 into master Jun 2, 2026
1 check passed
@bborbe bborbe deleted the feat/dispatcher-refactor branch June 2, 2026 20:02
bborbe added a commit that referenced this pull request Jun 2, 2026
…re agents

Completes the dispatcher refactor follow-ups.

commands/code-review.md — Step 4 now mirrors commands/pr-review.md's
dispatcher shape (PR #27): 4a ast-grep-runner mechanical funnel; 4b
per-Owner concurrent adjudication; 4c context-specific convention
reads; 4d citation validation via scripts/validate-citations.sh.
Conditional/full-mode agents that don't yet have RULE blocks
(license, readme-quality, shellcheck, context7, go-version-manager,
go-tooling) continue on the legacy direct-invoke path until their
conventions land as canonical rules.

10 per-language agents migrated to the dispatcher contract — added
'When invoked by the dispatcher' section + Source of truth pointer
to rules/index.json + citation discipline note:

- go-quality-assistant
- go-architecture-assistant
- go-factory-pattern-assistant
- go-http-handler-assistant
- go-metrics-assistant
- go-security-specialist (preserves toolchain fallback for findings
  outside the rule-base scope)
- go-test-quality-assistant
- godoc-assistant
- python-architecture-assistant
- python-quality-assistant

Total agents on dispatcher contract: 13 (3 from PR #27 + 10 here).
Remaining 18 agents are utility / auditor / file-existence-check
agents that don't enforce rules/index.json owners — they don't
need dispatcher contract migration.

Legacy mode preserved on every migrated agent — agents tolerate
both shapes during the rollout (dispatcher invocation + direct
slash-command-style invocation).

make precommit clean (links + JSON + check-index + check-coverage).