fix(migrations): drop CONCURRENTLY from 0201 indexes #830
Merged
CREATE INDEX CONCURRENTLY waits on virtualxid for every transaction open during its build phases: not just transactions touching the target table, but every one in the database. On this DB the legacy Python index_rewards_manager Celery task on discovery-provider keeps long (~3-min) transactions open against challenge_disbursements continuously, so as one ends another is already open, and the CONCURRENTLY build can wait indefinitely without ever seeing a quiet moment.

Switch to regular CREATE INDEX inside the migration's BEGIN/COMMIT. This takes a ShareLock on the target table, so writes are blocked for the build duration, but both target tables (sol_reward_disbursements, sol_claimable_accounts) have light write load (only the Go indexer writes them, and only on reward_manager EvaluateAttestations and claimable_token Create instructions, which are sparse on-chain). At current row counts the build completes in seconds; the blocked writes just queue on pgxpool and resume immediately after.

Also folds the indexes back inside the BEGIN/COMMIT so they're atomic with the backfill: if the INSERT fails, the indexes are rolled back too, leaving the schema clean for a retry.
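The wait described above can be observed from the catalog views while a build is running. The following is a minimal sketch, not part of the PR: it assumes PostgreSQL 12 or later (for `pg_stat_progress_create_index`), and the one-minute threshold is an arbitrary illustrative value.

```sql
-- Which phase is the concurrent build in, and which backend is it waiting on?
-- (pg_stat_progress_create_index exists in PostgreSQL 12+; assumed here.)
SELECT p.pid,
       p.phase,
       p.lockers_total,
       p.lockers_done,
       p.current_locker_pid,
       a.wait_event_type,
       a.wait_event
FROM pg_stat_progress_create_index AS p
JOIN pg_stat_activity AS a USING (pid);

-- Transactions in the current database that have been open for more than
-- one minute (threshold is illustrative), oldest first.
SELECT pid,
       usename,
       state,
       now() - xact_start AS xact_age,
       left(query, 80)    AS query_head
FROM pg_stat_activity
WHERE datname = current_database()
  AND xact_start IS NOT NULL
  AND pid <> pg_backend_pid()
  AND now() - xact_start > interval '1 minute'
ORDER BY xact_start;
```

If the second query never returns an empty result across repeated runs, a concurrent build is likely to keep finding a transaction to wait on.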
5 tasks
raymondjacobson added a commit that referenced this pull request May 19, 2026
#832)

## Summary

ISRC lookup via `GET /v1/tracks?isrc=…` failed when the stored value and the query disagreed on dash placement. Partner report (Thomas):

Fixed purely on the search/query side (no normalization at write time):

- Normalize the incoming `isrc` query param in Go: uppercase + strip non-alphanumeric characters before passing to the DB.
- In the SQL, compare against `regexp_replace(upper(isrc), '[^A-Z0-9]', '', 'g')` so the stored value is normalized at compare time.

Net result: any combination of dashes/spaces/casing in either the request or the stored row matches. `US-ANG-21-03742`, `USANG2103742`, `usang2103742`, and `US-ANG2103742` all resolve to the same row.

## Index

Added `tracks_isrc_normalized_idx`, a **partial functional index** matching the comparison expression so the new query can index-scan instead of seq-scanning all of `tracks`:

```sql
create index concurrently if not exists tracks_isrc_normalized_idx
  on public.tracks ((regexp_replace(upper(isrc), '[^A-Z0-9]'::text, ''::text, 'g')))
  where isrc is not null;
```

Verified planner picks it on a 50k-row scratch table:

```
Bitmap Heap Scan on scratch_tracks  (cost=16.43..248.18 rows=500 width=4)
  Recheck Cond: (regexp_replace(upper(isrc), '[^A-Z0-9]'::text, ''::text, 'g'::text) = ANY (...))
  ->  Bitmap Index Scan on scratch_tracks_isrc_norm_idx  (cost=0.00..16.30 rows=500 width=0)
        Index Cond: (regexp_replace(upper(isrc), '[^A-Z0-9]'::text, ''::text, 'g'::text) = ANY (...))
```

Index DDL uses `CREATE INDEX CONCURRENTLY` (no `BEGIN/COMMIT` wrapper, since it cannot run inside a transaction block) so the build does not hold a write-blocking `SHARE` lock on `tracks`. `IF NOT EXISTS` keeps the migration idempotent. Partial on `isrc IS NOT NULL` since most rows have no ISRC.

Heads-up given #830: that PR moved `0201` away from `CONCURRENTLY` because the legacy Python Celery task on `challenge_disbursements` was keeping ~3-minute transactions open continuously, blocking the build's `virtualxid` wait indefinitely. `tracks` is written by the Go indexer and isn't subject to that long-transaction pattern, so `CONCURRENTLY` should complete normally here. If a stuck build needs to be aborted, drop the invalid index with `DROP INDEX IF EXISTS tracks_isrc_normalized_idx;` and re-run.

## ISWC

There is no `iswc` query-param lookup today (column exists and is exposed in API responses, but no handler or sqlc query reads it). Nothing to mirror; left as-is.

## Test plan

- [x] `TestGetTracksByISRC` covers: stored-with-dashes queried undashed, queried with same dashes, queried lowercased; stored-without-dashes queried as-is and queried with inserted dashes.
- [x] Existing tracks tests (`TestTracksEndpoint`, `TestGetTracksByPermalink`, `TestGetTracksExcludesAccessAuthorities`, `Test200UnAuthed`) still pass.
- [x] Migration applies cleanly and is idempotent (re-running prints `relation "tracks_isrc_normalized_idx" already exists, skipping`).
- [x] `EXPLAIN` on a seeded scratch table shows `Bitmap Index Scan on tracks_isrc_normalized_idx`.
- [ ] After deploy: verify `SELECT indexrelid::regclass, indisvalid FROM pg_index WHERE indexrelid::regclass::text = 'tracks_isrc_normalized_idx';` returns `indisvalid = true`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
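The compare-time normalization in the commit message above can be checked in isolation. This is a small sketch that applies the same `regexp_replace(upper(...))` expression to the four spellings listed in its summary; the inline `VALUES` list and the `raw` / `normalized` aliases are illustrative and do not come from the commit.

```sql
-- Apply the comparison expression to each spelling of the same ISRC.
-- Every row normalizes to USANG2103742.
SELECT raw,
       regexp_replace(upper(raw), '[^A-Z0-9]', '', 'g') AS normalized
FROM (VALUES
        ('US-ANG-21-03742'),
        ('USANG2103742'),
        ('usang2103742'),
        ('US-ANG2103742')
     ) AS t(raw);
```

Because the index is defined on this same expression, a query that compares against it can use the index, whereas a comparison on the bare `isrc` column would not.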
## Summary

- Switch `0201_backfill_missing_reward_disbursements.sql` from `CREATE INDEX CONCURRENTLY` to plain `CREATE INDEX` inside the migration's `BEGIN/COMMIT`.
- The two indexes (`sol_reward_disbursements (challenge_id, specifier)` and `sol_claimable_accounts (ethereum_address, mint, slot DESC)`) are now atomic with the backfill INSERT: if anything fails, the schema rolls back cleanly.

## Why

`CREATE INDEX CONCURRENTLY` waits on a `virtualxid` lock for every transaction open during its build phases: not just transactions that touch the target table, but every one in the database.

The legacy Python `index_rewards_manager` Celery task on discovery-provider keeps ~3-minute transactions open against `challenge_disbursements` continuously. As fast as one ends, another is already open. So the `CONCURRENTLY` build can wait indefinitely without ever seeing a quiet moment, and it did, for 10+ minutes blocked on `Lock/virtualxid` in tonight's deploy.

Trade-off accepted: regular `CREATE INDEX` takes a `ShareLock` on the target table for the duration of the build, blocking writes. But both target tables are written only by the Go indexer, and only on reward_manager `EvaluateAttestations` and claimable token `Create` instructions, which are sparse on-chain. At current row counts each build completes in seconds; the blocked writes just queue on pgxpool and resume right after.

## Test plan

- Cancel the stuck build and drop the invalid indexes it left behind:
  ```sql
  SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE query ILIKE 'CREATE INDEX CONCURRENTLY%';
  DROP INDEX IF EXISTS sol_reward_disbursements_challenge_specifier_idx;
  DROP INDEX IF EXISTS sol_claimable_accounts_eth_mint_slot_idx;
  ```
- `bridge migrate` should complete in well under a minute.
- Both indexes report `indisvalid = true`:
  ```sql
  SELECT indexrelid::regclass, indisvalid
  FROM pg_index
  WHERE indexrelid::regclass::text IN (
    'sol_reward_disbursements_challenge_specifier_idx',
    'sol_claimable_accounts_eth_mint_slot_idx'
  );
  ```

🤖 Generated with Claude Code
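For orientation, the resulting shape of the migration can be sketched as follows. This is an illustration under assumptions, not the contents of `0201`: the backfill INSERT is not reproduced, its position relative to the index statements is a guess, and the pairing of each index name with its table and columns is inferred from the names used in the test plan.

```sql
BEGIN;

-- Backfill INSERT from the migration goes here (not reproduced in this sketch).

-- Plain CREATE INDEX: blocks writes to each table while it builds, but runs
-- inside the transaction, so a failure anywhere rolls everything back.
CREATE INDEX sol_reward_disbursements_challenge_specifier_idx
    ON sol_reward_disbursements (challenge_id, specifier);

CREATE INDEX sol_claimable_accounts_eth_mint_slot_idx
    ON sol_claimable_accounts (ethereum_address, mint, slot DESC);

COMMIT;
```

`CREATE INDEX CONCURRENTLY` cannot be placed inside such a block at all, which is why the earlier version had to keep the indexes outside the transaction.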