Add Callaway-Sant'Anna estimator and event study visualization by igerber · Pull Request #14 · igerber/diff-diff

igerber · 2026-01-03T16:30:00Z

Features:

CallawaySantAnna: Staggered DiD estimator with group-time ATT(g,t)
- Supports never-treated and not-yet-treated control groups
- Three estimation methods: doubly-robust, IPW, outcome regression
- Aggregations: event study (by relative time), by cohort, overall ATT
plot_event_study: Publication-ready coefficient plots
- Works with MultiPeriodDiD, CallawaySantAnna, or DataFrames
- Customizable colors, markers, shading, and reference lines
plot_group_effects: Visualize effects by treatment cohort

API improvements:

Export TwoWayFixedEffects in main package
Export parallel trends utilities: check_parallel_trends,
check_parallel_trends_robust, equivalence_test_trends
Version bump to 0.4.0

Tests:

17 tests for CallawaySantAnna covering all major functionality
14 tests for visualization (skipped when matplotlib unavailable)

Documentation:

README updated with CallawaySantAnna examples and usage guide
README updated with plot_event_study documentation
TODO.md created with roadmap for future features

Features: - CallawaySantAnna: Staggered DiD estimator with group-time ATT(g,t) - Supports never-treated and not-yet-treated control groups - Three estimation methods: doubly-robust, IPW, outcome regression - Aggregations: event study (by relative time), by cohort, overall ATT - plot_event_study: Publication-ready coefficient plots - Works with MultiPeriodDiD, CallawaySantAnna, or DataFrames - Customizable colors, markers, shading, and reference lines - plot_group_effects: Visualize effects by treatment cohort API improvements: - Export TwoWayFixedEffects in main package - Export parallel trends utilities: check_parallel_trends, check_parallel_trends_robust, equivalence_test_trends - Version bump to 0.4.0 Tests: - 17 tests for CallawaySantAnna covering all major functionality - 14 tests for visualization (skipped when matplotlib unavailable) Documentation: - README updated with CallawaySantAnna examples and usage guide - README updated with plot_event_study documentation - TODO.md created with roadmap for future features

Review of Callaway-Sant'Anna estimator and event study visualization. Overall assessment: Approve with minor changes (linting issues).

Fixes based on code review in PR #14: 1. Remove unused imports in staggered.py: - Removed Union from typing - Removed scipy.stats import - Removed validate_binary, compute_robust_se from utils imports 2. Fix forward references in visualization.py: - Added TYPE_CHECKING imports for MultiPeriodDiDResults and CallawaySantAnnaResults to resolve F821 linter errors 3. Remove unused variable in _aggregate_by_group: - Removed unused `ns` array (group weighting not needed for within-group aggregation across time periods) 4. Handle unimplemented features: - Raise NotImplementedError if n_bootstrap > 0 - Add UserWarning when covariates are provided (not yet used) 5. Import order fixed by ruff

R10 review flagged shell-script direct execution (`bash <script>`, `source <script>`, `./<script>`) and multi-line `python3 -c` bodies as unmodeled. After 10 rounds of progressive guard tightening (R0-R9), the trajectory is clear: each round adds a new shell-feature to model and the reviewer finds the next gap. Continuing further has diminishing return — static-shell-parsing has irreducible limits. Per discussion with author, the 16-test guard is accepted as 'reasonable defense against accidental regressions' rather than a complete adversarial parser. The dismissal's primary defense is the documented threat model (workflow comment block + dismissed_comment field on alert #14), not the test. This commit: - Soften the workflow comment block's "CI fails closed" line. Replace with explicit enumeration of what the test catches (accidental regressions of common forms) AND what it does NOT model (script execution, multi-line python -c, var expansion, eval, find -exec, xargs). Frame the test as belt-and-suspenders. - Expand TestWorkflowDoesNotExecutePRHeadCode docstring with the same SCOPE / NOT-MODELED enumeration. - Add TODO.md row tracking the residuals + reasoning for accepting them. Test invariants preserved (test_workflow_dismissal_comment_block_present still passes — the required strings remain present). 16 guard tests pass; YAML still parses. This is the final commit on PR #436. No more guard-coverage expansion. CodeQL #14 dismissal will be applied via `gh api PATCH` post-merge.

Dismiss CodeQL #14 (untrusted-checkout-toctou) with guard test

CodeQL alert igerber#14 (untrusted-checkout-toctou/critical) is the third re-fire of the same rule family on this workflow (igerber#11/igerber#12 → igerber#13 → igerber#14). PR igerber#427's fork-skip gate was load-bearing for igerber#11/igerber#12 but the rule re-classified as igerber#14 at a shifted line range with escalated severity. A structural fix (checkout BASE_SHA only; `git show` for PR-head reads) would close the rule cleanly but degrade reviewer quality: pr_review.md explicitly instructs Codex to `grep -n "pattern" diff_diff/*.py` for pattern-consistency checks, and Codex's `sandbox: read-only` lets it inspect workspace files. Under restructure, those operations would search BASE content, missing patterns added in the PR. Workarounds that materialize PR-head content via non-`actions/checkout` mechanisms (git worktree, overlay) satisfy CodeQL's pattern matcher without changing runtime risk — security theater dressed up. The honest path: dismiss with documented rationale + guard test. The actual threat model accepted by the dismissal: - Codex runs with sandbox: read-only — can read PR-head files, cannot execute or write them - head_sha is API-pinned in the resolve-pr step before checkout; TOCTOU window is sub-second and bounded by the is_fork gate - Same-repo PR contributors who can push to PR head branches already have write access to push to main; the checkout doesn't expand attack surface - Fork PRs are skipped entirely (PR igerber#427 fork-skip gate) The dismissal becomes invalid only if a future workflow edit adds a step that EXECUTES PR-head content. Guard test TestWorkflowDoesNotExecutePRHeadCode in tests/test_openai_review.py fails CI on any of: pip install, pytest, npm install, cargo run/test, make, ./configure, maturin develop/build, poetry install/run, pdm, uv sync/run, tox, setup.py, etc. The maturin entries are load-bearing for this repo (Rust build per CLAUDE.md is the most likely future regression vector). The dismissal API call (gh api PATCH .../code-scanning/alerts/14) will be run post-merge, separately from this PR.

claude added 2 commits January 3, 2026 16:24

Update CLAUDE.md with new module documentation

9352205

igerber pushed a commit that referenced this pull request Jan 3, 2026

Add code review for PR #14

5cd6aed

Review of Callaway-Sant'Anna estimator and event study visualization. Overall assessment: Approve with minor changes (linting issues).

claude added 2 commits January 3, 2026 16:52

Document CallawaySantAnna current limitations in README

d124a02

igerber merged commit fe9fd5a into main Jan 3, 2026

igerber deleted the claude/research-library-features-bru1W branch January 3, 2026 16:57

igerber mentioned this pull request May 14, 2026

Dismiss CodeQL #14 (untrusted-checkout-toctou) with guard test #436

Merged

4 tasks

igerber added a commit that referenced this pull request May 15, 2026

Merge pull request #436 from igerber/dismiss-codeql-14

c56dbcb

Dismiss CodeQL #14 (untrusted-checkout-toctou) with guard test

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Callaway-Sant'Anna estimator and event study visualization#14

Add Callaway-Sant'Anna estimator and event study visualization#14
igerber merged 4 commits into
mainfrom
claude/research-library-features-bru1W

igerber commented Jan 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

igerber commented Jan 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants