You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Full audit of postgres_dba (34 SQL reports + harness + CI + docs) on branch claude/review-legacy-project-pNJS1.
Testing performed
All 34 reports executed against a local PostgreSQL 16.13 (latest available locally; network blocked for PGDG/PG18) with a minimally-privileged dba_user (pg_monitor only), in both wide and normal modes, with and without required extensions.
Static analysis of every .sql, .psql, and shell script.
CI workflow audit (.github/workflows/test.yml).
Cross-check of start.psql vs sql/*.sql vs init/generate.sh.
Environment caveat: couldn't spin up PG 17/18 locally (no Docker, no PGDG repo access). CI covers PG 13–18 so runtime bugs specific to those versions may exist beyond what I could reproduce — flagged where applicable.
🔴 High severity — functional bugs / security
1. t1_tuning.sql errors when postgres_dba_interactive_mode isn't set
sql/t1_tuning.sql:7 does \if :postgres_dba_interactive_mode, but this variable is never initialized in warmup.psql. It's only set via ~/.psqlrc in the CI test harness.
Reproduces:
$ psql -f warmup.psql -f sql/t1_tuning.sql
psql:sql/t1_tuning.sql:7: error: unrecognized value ":postgres_dba_interactive_mode"
for "\if expression": Boolean expected
A user invoking this report outside the menu (e.g. documented "copy-paste to run standalone") hits an error. Fix: default it in warmup.psql:
2. i3_non_indexed_fks.sql hides FKs on small tables silently
sql/i3_non_indexed_fks.sql:118: hardcoded filter table_mb > 9 and (writes > 1000 or parent_writes > 1000 or parent_mb > 10).
On a fresh or small DB, the report returns zero rows even when FKs without indexes clearly exist. I had to grow fk_child past 9 MB locally before the report found the missing index. No documentation or hint to the user that results were filtered out.
The CI regression test (test.yml) only works because it inserts 200k child rows to cross the threshold. Real DBAs triaging a real issue will be confused.
Fix options: (a) lower/remove the threshold; (b) expose it as a psql variable; (c) surface a note when results are filtered.
3. Reports crash with raw SQL errors when required extensions are missing
Only the amcheck reports (c1–c4) handle the missing-extension case gracefully. Everything else emits an unhelpful catalog error:
Report
Extension
Error when missing
s1, s2, s3
pg_stat_statements
relation "pg_stat_statements" does not exist
m1
pg_buffercache
relation "pg_buffercache" does not exist
b3, b4
pgstattuple
function pgstattuple(oid) does not exist
c1's do $$ if not exists (select 1 from pg_extension ...) $$ pattern should be ported.
4. Password generators use non-cryptographic random()
misc/generate_password.sql even has a standing -- TODO: rework to use pgcrypto instead and its own warning: "random() that is used here is not cryptographically strong — if an attacker knows one value, it's easy to guess the 'next' value." The TODO is unaddressed; the same flaw exists in r1/r2 without the warning.
Fix: replace with gen_random_bytes() (pgcrypto) or gen_random_uuid()-based entropy.
There's also a subtle off-by-one: j := int4(random() * allowed_len) can equal allowed_len, making substr(allowed, j+1, 1) return '' — so the password can be shorter than 16 chars (rare but possible).
5. i5_indexes_migration.sql emits unquoted identifiers in DROP/CREATE DDL
sql/i5_indexes_migration.sql:35 builds schemaname || '.' || indexrelname::text with no quote_ident. Line 107 then embeds that straight into a DROP INDEX CONCURRENTLY %s; format.
Schemas or indexes with mixed case, spaces, or reserved words will generate DDL that either fails or targets the wrong object. Since this report exists to hand DBAs a migration to paste into production, this is high-impact. Fix: quote_ident(schemaname) || '.' || quote_ident(indexrelname).
6. i2_redundant_indexes.sql — fk_indexes CTE filter excludes heavily-used FK indexes
sql/i2_redundant_indexes.sql:30: and si.idx_scan < 10.
The fk_indexes CTE is supposed to enumerate FK-backing indexes so the supports_fk column tells users whether dropping a candidate is safe. But the filter drops any FK index with ≥ 10 scans — i.e. the well-used ones. That causes supports_fk = false for indexes that do back an FK, which leads users to drop indexes that silently break FK-enforcement performance. Either remove that filter or invert its intent.
7. alter_user_with_random_password.psql executes ALTER ROLE x statements with an empty clause
Lines 49–58: when the user answers "no" to superuser / login, the case emits an empty string and the generated SQL becomes alter role "x" — an effective no-op but still a wasted statement (and confusing raise debug output). More importantly: there's no branch that revokessuperuser/login, so the DDL is asymmetric (says yes → grants the attribute; says no → does nothing). Users trying to remove superuser via this script will be silently unsuccessful.
case when version() ~ 'mingw32|64-bit|x86_64|ppc64|ia64|amd64' then 8 else 4 end as ma, -- NS: TODO: check it
No aarch64 / arm64. AWS Graviton, Azure Ampere, and any PG running on Apple Silicon (via Postgres.app, homebrew) will be detected as 32-bit and get the wrong memory alignment → incorrect bloat estimates. pg_stat_user_tables isn't used to infer this; pg_controldata's MAXALIGN or reserved_connections trick would be better, or at least arm64|aarch64 should be added to the regex.
9. s1/s2 assume pg_stat_statements extension version matches server version
sql/s1_pg_stat_statements_top_total.sql:56 branches on :postgres_dba_pgvers_17plus to use shared_blk_read_time. But shared_blk_read_time was introduced in pg_stat_statements extension v1.11, which ships with PG 17 — a user on PG 17 who upgraded without running ALTER EXTENSION pg_stat_statements UPDATE still has extension 1.10, where only blk_read_time exists. Query will error.
Fix: gate on (select extversion from pg_extension where extname='pg_stat_statements') ≥ '1.11', not server version.
10. CLAUDE.md style violations merged to master
CLAUDE.md states: "<> not !=" and "Lowercase keywords (select, from, where — not SELECT, FROM, WHERE)". Current state:
!= violations (9): all recent additions (v7.0):
sql/c1_amcheck_indexes.sql:42, 85
sql/c2_amcheck_heap.sql:47, 90, 136
sql/c3_amcheck_parent.sql:48
sql/c4_amcheck_full.sql:55, 113
sql/t2_storage_parameters.sql:29
Uppercase keywords in executable SQL (not comments), file / count:
File
Approx count
sql/i1_rare_indexes.sql
42
sql/b2_btree_estimation.sql
24
sql/b5_tables_no_stats.sql
18
sql/b1_table_estimation.sql
9
sql/i5_indexes_migration.sql
5
sql/i2_redundant_indexes.sql
4
sql/i4_invalid_indexes.sql
3
Others (b3, b4, c1-4, v2)
1-2 each
Most are from the original pgexperts/ioguix sources, copied verbatim. If the rule is serious, there's a one-time cleanup to do; if it only applies to new code, that should be in CLAUDE.md. A lint step in CI (simple grep) would prevent regression.
11. 0_node.sql — Stats Since / Stats Age render as empty strings when stats_reset is null
New databases / never-reset stats produce:
Stats Since |
Stats Age |
Fine data, ugly display. coalesce(..., 'never reset') fixes it.
12. pg_ls_waldir() perms not documented
sql/0_node.sql:56. Requires pg_monitor or superuser. The README says pg_monitor is enough for "most" reports — 0_node.sql happens to fall in "most" only because it joins pg_monitor's perms. Users with bare pg_read_all_stats will get a permission error mid-UNION with no hint which subquery failed.
13. start.psql is committed generated code — drift is possible
223 lines of \elif :d_step_is_XXX branches, regenerated by init/generate.sh. Currently in sync (I verified), but there's no CI check that git diff --exit-code after regenerating is clean. A contributor who adds a report and forgets bash init/generate.sh would break the menu without CI noticing.
Fix: add a generate.sh --check mode to CI.
14. Emojis in RAISE NOTICE output (c1–c4)
❌, ✅, ℹ️, ⚠️ rendered via raise notice. Works in modern terminals, may render as ?/mojibake over non-UTF8 connections, in log captures, or in some CI runners. Consider ASCII-first output with opt-in emojis via a psql variable.
🟡 Low severity — cleanup / docs
15. Stale regression fixtures in test/regression/
test/regression/0_node.out contains Role | Master — terminology was replaced with Primary project-wide in v7.0. Fixture never updated.
test/regression/p1_alignment_padding.out — the report was renamed p1 → x1 in v7.0; fixture filename never updated.
All three fixtures are unused — test.yml regression step uses inline grep, not file comparison. They're dead weight that will mislead the next contributor.
Fix: delete or actually wire them up (e.g. diff -u test/regression/0_node.out <(psql ... | grep Role)).
16. Obsolete PG 9.5 branch in warmup.psql
warmup.psql:26-34 still has a PG 9.5 compatibility branch with -- TODO: improve work with custom GUCs for Postgres 9.5 and older. README advertises PG 13–18 only; drop it. Same block is duplicated in init/generate.sh:38.
17. Missing trailing newlines
sql/r1_create_user_with_random_password.sql and sql/r2_alter_user_with_random_password.sql — POSIX text-file convention violated.
18. CI regression coverage is thin
Only 4 of 34 reports have output assertions; the other 30 are checked only for "does not error."
No negative tests (missing extension, no permissions).
No test of the superuser code paths (r1/r2 never exercised in CI).
No test of t1 interactive branch, m1 on a DB larger than 11 MB, b3/b4 on real bloat, etc.
No fixture for matviews/refresh_all.sql or misc/generate_password.sql.
Output snapshot testing infra (test/regression/) exists but unused.
19. x1_alignment_padding.sql has 4 open TODOs
-- TODO: not-yet-analyzed tables, -- TODO: NULLs, -- TODO: simplify, cleanup, -- TODO: chunk_size 4 or 8. Labeled [EXP] so acceptable, but has been experimental for a long time — either graduate it or move it out of the menu.
20. misc/generate_password.sql has --, 'md5' || md5(password || {{username}}) as password_md5 — looks like dead template-placeholder syntax
{{username}} isn't substituted anywhere. Either remove or document the intended templating step.
21. \prompt flows accept empty input
All interactive flows (r1, r2, t1) rely on users typing the exact expected token. Empty input / typos fall through silently to "no" or to downstream errors. Not a security issue, but poor UX.
22. No CONTRIBUTING.md
CLAUDE.md encodes the project's style rules but nothing tells external contributors. Likely why v7.0 merged with != violations — the rule isn't visible.
23. README.md "Adding Custom Reports" doc is incomplete
Says: "The first line must be a -- comment with the description." But init/generate.sh:66 strips only ^-- * — files like c1_amcheck_indexes.sql start with -- Corruption: ... (single dash-space-colon) and work, while b1_table_estimation.sql starts with --Table bloat (no space after dashes). Some reports keep the word after --, some don't. The generator accepts both, but users won't know that from the docs.
Test results summary
Mode
PG version
Result
Normal
16.13
34/34 reports complete without error (as pg_monitor-only user with all 5 extensions present + GRANT EXECUTE ... TO PUBLIC)
Wide
16.13
34/34 reports complete without error
Normal, no extensions
16.13
27/34 OK; 7 fail with raw errors (s1, s2, s3, m1, b3, b4 — see #3 above)
Normal, no GRANT EXECUTE, pg_monitor only
16.13
c1-c4 correctly fall through to insufficient_privilege handler and report "skipped"; other reports unaffected
CI is green on 13–18 for the happy path but doesn't catch any of the above because it runs with all extensions installed and postgres_dba_interactive_mode=false in ~/.psqlrc.
Scope
Full audit of
postgres_dba(34 SQL reports + harness + CI + docs) on branchclaude/review-legacy-project-pNJS1.Testing performed
dba_user(pg_monitoronly), in bothwideandnormalmodes, with and without required extensions..sql,.psql, and shell script..github/workflows/test.yml).start.psqlvssql/*.sqlvsinit/generate.sh.Environment caveat: couldn't spin up PG 17/18 locally (no Docker, no PGDG repo access). CI covers PG 13–18 so runtime bugs specific to those versions may exist beyond what I could reproduce — flagged where applicable.
🔴 High severity — functional bugs / security
1.
t1_tuning.sqlerrors whenpostgres_dba_interactive_modeisn't setsql/t1_tuning.sql:7does\if :postgres_dba_interactive_mode, but this variable is never initialized inwarmup.psql. It's only set via~/.psqlrcin the CI test harness.Reproduces:
A user invoking this report outside the menu (e.g. documented "copy-paste to run standalone") hits an error. Fix: default it in
warmup.psql:2.
i3_non_indexed_fks.sqlhides FKs on small tables silentlysql/i3_non_indexed_fks.sql:118: hardcoded filtertable_mb > 9 and (writes > 1000 or parent_writes > 1000 or parent_mb > 10).On a fresh or small DB, the report returns zero rows even when FKs without indexes clearly exist. I had to grow
fk_childpast 9 MB locally before the report found the missing index. No documentation or hint to the user that results were filtered out.The CI regression test (
test.yml) only works because it inserts 200k child rows to cross the threshold. Real DBAs triaging a real issue will be confused.Fix options: (a) lower/remove the threshold; (b) expose it as a psql variable; (c) surface a note when results are filtered.
3. Reports crash with raw SQL errors when required extensions are missing
Only the amcheck reports (
c1–c4) handle the missing-extension case gracefully. Everything else emits an unhelpful catalog error:relation "pg_stat_statements" does not existrelation "pg_buffercache" does not existfunction pgstattuple(oid) does not existc1'sdo $$ if not exists (select 1 from pg_extension ...) $$pattern should be ported.4. Password generators use non-cryptographic
random()roles/create_user_with_random_password.psql:42-45,roles/alter_user_with_random_password.psql:42-45,misc/generate_password.sql:1-3.misc/generate_password.sqleven has a standing-- TODO: rework to use pgcrypto insteadand its own warning: "random() that is used here is not cryptographically strong — if an attacker knows one value, it's easy to guess the 'next' value." The TODO is unaddressed; the same flaw exists in r1/r2 without the warning.Fix: replace with
gen_random_bytes()(pgcrypto) orgen_random_uuid()-based entropy.There's also a subtle off-by-one:
j := int4(random() * allowed_len)can equalallowed_len, makingsubstr(allowed, j+1, 1)return''— so the password can be shorter than 16 chars (rare but possible).5.
i5_indexes_migration.sqlemits unquoted identifiers in DROP/CREATE DDLsql/i5_indexes_migration.sql:35buildsschemaname || '.' || indexrelname::textwith noquote_ident. Line 107 then embeds that straight into aDROP INDEX CONCURRENTLY %s;format.Schemas or indexes with mixed case, spaces, or reserved words will generate DDL that either fails or targets the wrong object. Since this report exists to hand DBAs a migration to paste into production, this is high-impact. Fix:
quote_ident(schemaname) || '.' || quote_ident(indexrelname).6.
i2_redundant_indexes.sql—fk_indexesCTE filter excludes heavily-used FK indexessql/i2_redundant_indexes.sql:30:and si.idx_scan < 10.The
fk_indexesCTE is supposed to enumerate FK-backing indexes so thesupports_fkcolumn tells users whether dropping a candidate is safe. But the filter drops any FK index with ≥ 10 scans — i.e. the well-used ones. That causessupports_fk = falsefor indexes that do back an FK, which leads users to drop indexes that silently break FK-enforcement performance. Either remove that filter or invert its intent.7.
alter_user_with_random_password.psqlexecutesALTER ROLE xstatements with an empty clauseLines 49–58: when the user answers "no" to superuser / login, the
caseemits an empty string and the generated SQL becomesalter role "x"— an effective no-op but still a wasted statement (and confusingraise debugoutput). More importantly: there's no branch that revokessuperuser/login, so the DDL is asymmetric (says yes → grants the attribute; says no → does nothing). Users trying to remove superuser via this script will be silently unsuccessful.🟠 Medium severity — correctness / compatibility
8.
b1_table_estimation.sqlarchitecture detection misses ARM64sql/b1_table_estimation.sql:22:No
aarch64/arm64. AWS Graviton, Azure Ampere, and any PG running on Apple Silicon (via Postgres.app, homebrew) will be detected as 32-bit and get the wrong memory alignment → incorrect bloat estimates.pg_stat_user_tablesisn't used to infer this;pg_controldata'sMAXALIGNorreserved_connectionstrick would be better, or at leastarm64|aarch64should be added to the regex.9.
s1/s2assumepg_stat_statementsextension version matches server versionsql/s1_pg_stat_statements_top_total.sql:56branches on:postgres_dba_pgvers_17plusto useshared_blk_read_time. Butshared_blk_read_timewas introduced in pg_stat_statements extension v1.11, which ships with PG 17 — a user on PG 17 who upgraded without runningALTER EXTENSION pg_stat_statements UPDATEstill has extension 1.10, where onlyblk_read_timeexists. Query will error.Fix: gate on
(select extversion from pg_extension where extname='pg_stat_statements')≥ '1.11', not server version.10. CLAUDE.md style violations merged to master
CLAUDE.md states: "
<>not!=" and "Lowercase keywords (select, from, where — not SELECT, FROM, WHERE)". Current state:!=violations (9): all recent additions (v7.0):sql/c1_amcheck_indexes.sql:42, 85sql/c2_amcheck_heap.sql:47, 90, 136sql/c3_amcheck_parent.sql:48sql/c4_amcheck_full.sql:55, 113sql/t2_storage_parameters.sql:29Uppercase keywords in executable SQL (not comments), file / count:
sql/i1_rare_indexes.sqlsql/b2_btree_estimation.sqlsql/b5_tables_no_stats.sqlsql/b1_table_estimation.sqlsql/i5_indexes_migration.sqlsql/i2_redundant_indexes.sqlsql/i4_invalid_indexes.sqlMost are from the original pgexperts/ioguix sources, copied verbatim. If the rule is serious, there's a one-time cleanup to do; if it only applies to new code, that should be in CLAUDE.md. A
lintstep in CI (simple grep) would prevent regression.11.
0_node.sql—Stats Since/Stats Agerender as empty strings whenstats_reset is nullNew databases / never-reset stats produce:
Fine data, ugly display.
coalesce(..., 'never reset')fixes it.12.
pg_ls_waldir()perms not documentedsql/0_node.sql:56. Requirespg_monitoror superuser. The README says pg_monitor is enough for "most" reports —0_node.sqlhappens to fall in "most" only because it joinspg_monitor's perms. Users with barepg_read_all_statswill get a permission error mid-UNION with no hint which subquery failed.13.
start.psqlis committed generated code — drift is possible223 lines of
\elif :d_step_is_XXXbranches, regenerated byinit/generate.sh. Currently in sync (I verified), but there's no CI check thatgit diff --exit-codeafter regenerating is clean. A contributor who adds a report and forgetsbash init/generate.shwould break the menu without CI noticing.Fix: add a
generate.sh --checkmode to CI.14. Emojis in
RAISE NOTICEoutput (c1–c4)❌,✅,ℹ️,⚠️rendered viaraise notice. Works in modern terminals, may render as?/mojibake over non-UTF8 connections, in log captures, or in some CI runners. Consider ASCII-first output with opt-in emojis via a psql variable.🟡 Low severity — cleanup / docs
15. Stale regression fixtures in
test/regression/test/regression/0_node.outcontainsRole | Master— terminology was replaced withPrimaryproject-wide in v7.0. Fixture never updated.test/regression/p1_alignment_padding.out— the report was renamedp1→x1in v7.0; fixture filename never updated.test.ymlregression step uses inlinegrep, not file comparison. They're dead weight that will mislead the next contributor.Fix: delete or actually wire them up (e.g.
diff -u test/regression/0_node.out <(psql ... | grep Role)).16. Obsolete PG 9.5 branch in
warmup.psqlwarmup.psql:26-34still has a PG 9.5 compatibility branch with-- TODO: improve work with custom GUCs for Postgres 9.5 and older. README advertises PG 13–18 only; drop it. Same block is duplicated ininit/generate.sh:38.17. Missing trailing newlines
sql/r1_create_user_with_random_password.sqlandsql/r2_alter_user_with_random_password.sql— POSIX text-file convention violated.18. CI regression coverage is thin
t1interactive branch,m1on a DB larger than 11 MB,b3/b4on real bloat, etc.matviews/refresh_all.sqlormisc/generate_password.sql.test/regression/) exists but unused.19.
x1_alignment_padding.sqlhas 4 open TODOs-- TODO: not-yet-analyzed tables,-- TODO: NULLs,-- TODO: simplify, cleanup,-- TODO: chunk_size 4 or 8. Labeled[EXP]so acceptable, but has been experimental for a long time — either graduate it or move it out of the menu.20.
misc/generate_password.sqlhas--, 'md5' || md5(password || {{username}}) as password_md5— looks like dead template-placeholder syntax{{username}}isn't substituted anywhere. Either remove or document the intended templating step.21.
\promptflows accept empty inputAll interactive flows (
r1,r2,t1) rely on users typing the exact expected token. Empty input / typos fall through silently to "no" or to downstream errors. Not a security issue, but poor UX.22. No
CONTRIBUTING.mdCLAUDE.md encodes the project's style rules but nothing tells external contributors. Likely why v7.0 merged with
!=violations — the rule isn't visible.23.
README.md"Adding Custom Reports" doc is incompleteSays: "The first line must be a
--comment with the description." Butinit/generate.sh:66strips only^-- *— files likec1_amcheck_indexes.sqlstart with-- Corruption: ...(single dash-space-colon) and work, whileb1_table_estimation.sqlstarts with--Table bloat(no space after dashes). Some reports keep the word after--, some don't. The generator accepts both, but users won't know that from the docs.Test results summary
pg_monitor-only user with all 5 extensions present +GRANT EXECUTE ... TO PUBLIC)GRANT EXECUTE,pg_monitoronlyinsufficient_privilegehandler and report "skipped"; other reports unaffectedpsql -f warmup.psql -f sql/t1_tuning.sqlCI is green on 13–18 for the happy path but doesn't catch any of the above because it runs with all extensions installed and
postgres_dba_interactive_mode=falsein~/.psqlrc.Suggested triage order
\ifcrash) — tiny patch, user-visible.c1pattern to s*, m1, b3, b4.supports_fkinversion) — risks of bad DROP recommendations.gen_random_bytesin r1/r2/misc) — address the existing TODO.!=→<>cleanup + lint in CI).Happy to open focused PRs for any of the above.