Add Callaway-Sant'Anna estimator and event study visualization#14
Merged
Conversation
Features: - CallawaySantAnna: Staggered DiD estimator with group-time ATT(g,t) - Supports never-treated and not-yet-treated control groups - Three estimation methods: doubly-robust, IPW, outcome regression - Aggregations: event study (by relative time), by cohort, overall ATT - plot_event_study: Publication-ready coefficient plots - Works with MultiPeriodDiD, CallawaySantAnna, or DataFrames - Customizable colors, markers, shading, and reference lines - plot_group_effects: Visualize effects by treatment cohort API improvements: - Export TwoWayFixedEffects in main package - Export parallel trends utilities: check_parallel_trends, check_parallel_trends_robust, equivalence_test_trends - Version bump to 0.4.0 Tests: - 17 tests for CallawaySantAnna covering all major functionality - 14 tests for visualization (skipped when matplotlib unavailable) Documentation: - README updated with CallawaySantAnna examples and usage guide - README updated with plot_event_study documentation - TODO.md created with roadmap for future features
igerber
pushed a commit
that referenced
this pull request
Jan 3, 2026
Review of Callaway-Sant'Anna estimator and event study visualization. Overall assessment: Approve with minor changes (linting issues).
Fixes based on code review in PR #14: 1. Remove unused imports in staggered.py: - Removed Union from typing - Removed scipy.stats import - Removed validate_binary, compute_robust_se from utils imports 2. Fix forward references in visualization.py: - Added TYPE_CHECKING imports for MultiPeriodDiDResults and CallawaySantAnnaResults to resolve F821 linter errors 3. Remove unused variable in _aggregate_by_group: - Removed unused `ns` array (group weighting not needed for within-group aggregation across time periods) 4. Handle unimplemented features: - Raise NotImplementedError if n_bootstrap > 0 - Add UserWarning when covariates are provided (not yet used) 5. Import order fixed by ruff
4 tasks
igerber
added a commit
that referenced
this pull request
May 15, 2026
R10 review flagged shell-script direct execution (`bash <script>`, `source <script>`, `./<script>`) and multi-line `python3 -c` bodies as unmodeled. After 10 rounds of progressive guard tightening (R0-R9), the trajectory is clear: each round adds a new shell-feature to model and the reviewer finds the next gap. Continuing further has diminishing return — static-shell-parsing has irreducible limits. Per discussion with author, the 16-test guard is accepted as 'reasonable defense against accidental regressions' rather than a complete adversarial parser. The dismissal's primary defense is the documented threat model (workflow comment block + dismissed_comment field on alert #14), not the test. This commit: - Soften the workflow comment block's "CI fails closed" line. Replace with explicit enumeration of what the test catches (accidental regressions of common forms) AND what it does NOT model (script execution, multi-line python -c, var expansion, eval, find -exec, xargs). Frame the test as belt-and-suspenders. - Expand TestWorkflowDoesNotExecutePRHeadCode docstring with the same SCOPE / NOT-MODELED enumeration. - Add TODO.md row tracking the residuals + reasoning for accepting them. Test invariants preserved (test_workflow_dismissal_comment_block_present still passes — the required strings remain present). 16 guard tests pass; YAML still parses. This is the final commit on PR #436. No more guard-coverage expansion. CodeQL #14 dismissal will be applied via `gh api PATCH` post-merge.
igerber
added a commit
that referenced
this pull request
May 15, 2026
Dismiss CodeQL #14 (untrusted-checkout-toctou) with guard test
TDL77
pushed a commit
to TDL77/diff-diff
that referenced
this pull request
May 18, 2026
CodeQL alert igerber#14 (untrusted-checkout-toctou/critical) is the third re-fire of the same rule family on this workflow (igerber#11/igerber#12 → igerber#13 → igerber#14). PR igerber#427's fork-skip gate was load-bearing for igerber#11/igerber#12 but the rule re-classified as igerber#14 at a shifted line range with escalated severity. A structural fix (checkout BASE_SHA only; `git show` for PR-head reads) would close the rule cleanly but degrade reviewer quality: pr_review.md explicitly instructs Codex to `grep -n "pattern" diff_diff/*.py` for pattern-consistency checks, and Codex's `sandbox: read-only` lets it inspect workspace files. Under restructure, those operations would search BASE content, missing patterns added in the PR. Workarounds that materialize PR-head content via non-`actions/checkout` mechanisms (git worktree, overlay) satisfy CodeQL's pattern matcher without changing runtime risk — security theater dressed up. The honest path: dismiss with documented rationale + guard test. The actual threat model accepted by the dismissal: - Codex runs with sandbox: read-only — can read PR-head files, cannot execute or write them - head_sha is API-pinned in the resolve-pr step before checkout; TOCTOU window is sub-second and bounded by the is_fork gate - Same-repo PR contributors who can push to PR head branches already have write access to push to main; the checkout doesn't expand attack surface - Fork PRs are skipped entirely (PR igerber#427 fork-skip gate) The dismissal becomes invalid only if a future workflow edit adds a step that EXECUTES PR-head content. Guard test TestWorkflowDoesNotExecutePRHeadCode in tests/test_openai_review.py fails CI on any of: pip install, pytest, npm install, cargo run/test, make, ./configure, maturin develop/build, poetry install/run, pdm, uv sync/run, tox, setup.py, etc. The maturin entries are load-bearing for this repo (Rust build per CLAUDE.md is the most likely future regression vector). The dismissal API call (gh api PATCH .../code-scanning/alerts/14) will be run post-merge, separately from this PR.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Features:
API improvements:
check_parallel_trends_robust, equivalence_test_trends
Tests:
Documentation: