Skip to content

feat(precommit): add check-index target to catch walker-output drift#13

Merged
bborbe merged 1 commit into
masterfrom
feat/precommit-index-check
Jun 2, 2026
Merged

feat(precommit): add check-index target to catch walker-output drift#13
bborbe merged 1 commit into
masterfrom
feat/precommit-index-check

Conversation

@bborbe

@bborbe bborbe commented Jun 2, 2026

Copy link
Copy Markdown
Owner

Summary

Adds `check-index` to the `make precommit` chain so doc edits that change `### RULE` blocks must also include the regenerated `rules/index.json`. Fail-loud beats silent drift.

Closes the gap exposed by PR #9 → PR #10: PR #9 trimmed a doc with pre-canonicalized rule blocks but didn't run `make build-index` — the 7 new `agent-cmd/*` entries were silently absent from `rules/index.json` until PR #10 retroactively picked them up.

Implementation

`check-index` regenerates the walker output to a tmp file, diffs against committed `rules/index.json`, fails with the diff + actionable instruction if they differ:

```make
.PHONY: check-index
check-index:
@python3 scripts/build-index.py > /tmp/coding-rules-index-check.json
@if ! diff -q rules/index.json /tmp/coding-rules-index-check.json > /dev/null 2>&1; then \
echo "ERROR: rules/index.json is stale. Run 'make build-index' and commit the result."; \
diff -u rules/index.json /tmp/coding-rules-index-check.json | head -40; \
rm -f /tmp/coding-rules-index-check.json; \
exit 1; \
fi
@rm -f /tmp/coding-rules-index-check.json
@echo "rules/index.json up-to-date"
```

Wired into precommit:

```make
precommit: check-links check-json check-index
```

Why diff-not-regenerate

Considered: just run `build-index` in precommit. Rejected — that mutates a committed file silently; precommit "passes" but leaves an uncommitted change in the working tree. Diff-check fails loud and forces the author to run `make build-index` and `git add` it explicitly.

Test plan

  • Clean state (master HEAD): `make precommit` passes (`rules/index.json up-to-date`)
  • Synthetic drift (appended `{}` to `rules/index.json`): fails with diff + non-zero exit
  • Restored state: `make precommit` passes again
  • Walker is deterministic ("sorted, byte-stable" per its header) → diff cannot false-positive on insertion-order changes

PR #9 trimmed a doc that contained pre-canonicalized rule blocks but
shipped without running 'make build-index'. The 7 new agent-cmd/*
entries were silently absent from rules/index.json until PR #10 picked
them up retroactively. The lesson: walker-output drift is silent at
review time; rules/index.json passes JSON validation regardless.

Fix: add 'check-index' to the precommit chain. The target regenerates
the walker output to a tmp file, diffs against the committed index,
and fails with the diff + a 'run make build-index and commit' message
if they differ.

Behavior verified:
- Clean state (master HEAD c07b809): precommit passes
- Synthetic drift (extra line appended to rules/index.json): precommit
  fails with diff output and non-zero exit code

This closes the gap that turned a one-PR change (PR #9) into a two-PR
walker-regen catch-up (PR #10). Future trim/bootstrap PRs that touch
rule blocks will fail precommit locally if the walker output isn't
also committed.

@ben-s-pull-request-reviewer ben-s-pull-request-reviewer Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
  "verdict": "approve",
  "summary": "The check-index target is a well-designed addition that detects when rules/index.json has drifted from its source. The two concerns from planning (diff -q exit code reliability and temp file cleanup) were analyzed and found to be correctly handled in the implementation. One minor observation: if build-index.py fails, Make's default behavior (without set -e) allows the diff step to run against an incomplete temp file, but this is unlikely in practice and the error output would still be informative.",
  "comments": [
    {
      "file": "Makefile",
      "line": 32,
      "severity": "nit",
      "message": "Nice to have: Consider adding '|| true' or explicit error handling after python3 build-index.py to ensure failures are caught. Without set -e, Make continues past non-zero exits from command substitutions. This is minor since build-index.py failures would be caught by pre-existing checks."
    }
  ],
  "concerns_addressed": [
    "correctness: diff -q exit code handling is reliable across bash versions — confirmed",
    "correctness: temp file /tmp/coding-rules-index-check.json IS cleaned up before exit 1 in error path — concern not valid"
  ]
}

@bborbe bborbe merged commit 8915952 into master Jun 2, 2026
1 check passed
@bborbe bborbe deleted the feat/precommit-index-check branch June 2, 2026 09:50
bborbe added a commit that referenced this pull request Jun 2, 2026
Restructures the 4 MUST sections in docs/go-licensing-guide.md into
canonical `### RULE` blocks. Same template as PRs #2-5, #8, #10.

Rules added:

- go-licensing/license-file-required (MUST, judgment)
  Public Go repos MUST have a root LICENSE file. Private/internal repos
  exempt. Detection requires file-existence check — not ast-grep.
- go-licensing/readme-license-section-required (MUST, judgment)
  Public README must have a ## License H2 section pointing to LICENSE.
  Markdown structure check — not ast-grep.
- go-licensing/source-file-header-required (MUST, addlicense)
  All non-vendor *.go files need the 3-line BSD-2 header. Enforcement
  delegates to 'addlicense -check' via make precommit (canonical tool;
  already wired into the toolchain).
- go-licensing/copyright-year-discipline (MUST, judgment)
  No bulk year-updates for trivial changes; no future/non-numeric
  years. PR-scope diff inspection — partial ast-grep detection
  possible for the future-year case but bulk-update trigger needs
  diff-scope reasoning.

rules/index.json: 42 -> 46

Walker drift detected by PR #13's check-index target during first
precommit run — regenerated rules/index.json and re-ran precommit
clean. New guard working as designed.

Pre-emptive checks (lessons from #6, #8): no personal vault paths,
no trading-domain terms, no internal contradictions, all cross-refs
resolve. License-assistant agent exists at agents/license-assistant.md.
bborbe added a commit that referenced this pull request Jun 2, 2026
PR #13's check-index target caught a precommit failure on 080a344 —
the doc edit changed the applies_when text but I committed without
running 'make build-index' first. The drift guard worked as designed.
This commit picks up the regenerated rules/index.json.
bborbe added a commit that referenced this pull request Jun 2, 2026
…dator + coverage lint

Phase 4-8 of the original task — the dispatcher unlock. The 124-rule
index now has a real pipeline consuming it.

ARCHITECTURE

Before: commands/pr-review.md loaded specific docs (Step 2.5 hardcoded
table) and dispatched a fixed 4-agent list (Step 4). Each agent
re-implemented its own pattern detection on top of reading its
specific doc — no use of rules/index.json or rules/<lang>/*.yml.

After: dispatcher invokes a thin mechanical-funnel agent
(ast-grep-runner), receives findings grouped by Owner, then dispatches
per-Owner judgment-tier adjudication concurrently. Citation validator
drops findings citing missing rule IDs. Coverage lint catches drift
between docs, index, and YAMLs at precommit time.

WHAT'S IN THIS PR

1. agents/ast-grep-runner.md (new) — single-responsibility mechanical
   funnel. Reads rules/index.json, runs ast-grep per YAML, parses
   JSON output, groups findings by Owner. Never invokes LLM tools;
   never modifies files. Defined output contract so the dispatcher
   can pipe it directly to per-language agents.

2. commands/pr-review.md (rewritten Step 2.5 + Step 4) — now:
   - Step 4a: invoke ast-grep-runner → findings grouped by Owner
   - Step 4b: per-Owner concurrent dispatch with adjudication prompt
     containing the pre-filtered findings + judgment-tier rule IDs
     the agent owns
   - Step 4c: context-specific convention reads (kept teamvault /
     k8s-binary / k8s-manifest / changelog mappings as before — these
     are file-pattern triggers, not rule-id-driven)
   - Step 4d: citation validation via scripts/validate-citations.sh
   Per-Owner dispatch fans out concurrently — N agents see N
   independent finding sets, no serialization.

3. scripts/validate-citations.sh (new) — filters review-output
   findings against rules/index.json's rule_id set. Drops findings
   with missing IDs, logs offenders to stderr, exits non-zero if any
   drops occurred (so the dispatcher captures the drift signal but
   still consolidates the validated subset).

4. scripts/check-coverage.sh (new) — coverage lint. Three checks:
   a) every rule's enforcement-cited rules/<lang>/<file>.yml path
      resolves to an existing file
   b) every rules/<lang>/*.yml is referenced by exactly one index
      entry (catches orphan YAMLs after rule renames)
   c) every YAML's id: field matches an index entry's id (catches
      rename drift between doc and YAML)
   Same shape as PR #13's check-index.

5. Makefile — precommit chain extended:
     precommit: check-links check-json check-index check-coverage
   New check-coverage target wraps the script.

6. Three per-language agents (go-error, go-time, go-context)
   simplified to expect pre-filtered findings + judgment-tier rule
   ownership. The rest of the agents continue working in the
   legacy 'scan + judge' mode for now and migrate in follow-up PRs
   — the dispatcher tolerates both shapes during the transition
   (agents that re-scan emit findings; the runner's pre-filtered
   set just doesn't overlap with theirs).

OPERATIONAL NOTES

- The dispatcher refactor is the original Phase 4-8 unlock from
  [[Refactor coding pr-review to doc-driven rules pipeline]]. With
  124 rules in the index and 15 mechanical YAMLs across rules/go/,
  the pipeline has real material to operate on from day one.

- Smoke testing the dispatcher end-to-end against an old PR is the
  next step before this lands on master; smoke fixtures live as
  judgment in /coding:pr-review invocations.

- Per-language agents not yet migrated (the other ~25) continue
  to work in legacy mode — adding their dispatcher-shaped prompt
  is a 30-min change per agent, parallelizable.

make precommit clean (links + JSON + check-index + new check-coverage).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant