fix(migrations): drop CONCURRENTLY from 0201 indexes #830

Merged: rickyrombo merged 1 commit into main from mp/drop-concurrently-from-0201 on May 19, 2026

@rickyrombo

Summary

  • Switches 0201_backfill_missing_reward_disbursements.sql from CREATE INDEX CONCURRENTLY to plain CREATE INDEX inside the migration's BEGIN/COMMIT.
  • Both indexes (sol_reward_disbursements (challenge_id, specifier) and sol_claimable_accounts (ethereum_address, mint, slot DESC)) are now atomic with the backfill INSERT — if anything fails, the schema rolls back cleanly.

Why

CREATE INDEX CONCURRENTLY waits on a virtualxid lock for every transaction open during its build phases — not just transactions that touch the target table, but every one in the cluster.

The legacy Python index_rewards_manager Celery task on discovery-provider keeps ~3-minute transactions open against challenge_disbursements continuously. As fast as one ends, another is already open. So the CONCURRENTLY build can wait indefinitely without ever seeing a quiet moment — and it did, for 10+ minutes blocked on Lock/virtualxid in tonight's deploy.

Trade-off accepted: regular CREATE INDEX takes a ShareLock on the target table for the duration of the build, blocking writes. But both target tables are written only by the Go indexer, and only on reward_manager EvaluateAttestations and claimable token Create instructions — sparse on-chain. At current row counts each build completes in seconds; the blocked writes just queue on pgxpool and resume right after.

Test plan

  • Cancel any in-flight 0201 attempt and drop any invalid index it left behind:
    SELECT pg_cancel_backend(pid)
    FROM pg_stat_activity
    WHERE query ILIKE 'CREATE INDEX CONCURRENTLY%';
    DROP INDEX IF EXISTS sol_reward_disbursements_challenge_specifier_idx;
    DROP INDEX IF EXISTS sol_claimable_accounts_eth_mint_slot_idx;
  • Roll the new image; migration Job's bridge migrate should complete in well under a minute.
  • Verify both indexes exist as indisvalid = true:
    SELECT indexrelid::regclass, indisvalid
    FROM pg_index
    WHERE indexrelid::regclass::text IN (
      'sol_reward_disbursements_challenge_specifier_idx',
      'sol_claimable_accounts_eth_mint_slot_idx'
    );
  • Verify the missing-row count drops as expected (per the test plan in #829, perf(migrations): speed up reward disbursements backfill).

🤖 Generated with Claude Code

CREATE INDEX CONCURRENTLY waits on virtualxid for every transaction
open during its build phases — not just transactions touching the
target table, but every one. On this DB the legacy Python
index_rewards_manager Celery task on discovery-provider keeps long
(~3-min) transactions open against challenge_disbursements
continuously, so as one ends another is already open, and the
CONCURRENTLY build can wait indefinitely without ever seeing a
quiet moment.

Switch to regular CREATE INDEX inside the migration's BEGIN/COMMIT.
This takes a ShareLock on the target table — writes are blocked for
the build duration — but both target tables (sol_reward_disbursements,
sol_claimable_accounts) have light write load (only the Go indexer
writes them, and only on reward_manager EvaluateAttestations and
claimable_token Create instructions, which are sparse on-chain). At
current row counts the build completes in seconds; the blocked writes
just queue on pgxpool and resume immediately after.

Also folds the indexes back inside the BEGIN/COMMIT so they're atomic
with the backfill — if the INSERT fails, the indexes are rolled back
too, leaving the schema clean for a retry.
@rickyrombo rickyrombo merged commit 05f5548 into main May 19, 2026
4 checks passed
@rickyrombo rickyrombo deleted the mp/drop-concurrently-from-0201 branch May 19, 2026 17:12
raymondjacobson added a commit that referenced this pull request May 19, 2026
#832)

## Summary

ISRC lookup via `GET /v1/tracks?isrc=…` failed when the stored value and
the query disagreed on dash placement. Partner report (Thomas):

Fixed purely on the search/query side (no normalization at write time):

- Normalize the incoming `isrc` query param in Go: uppercase + strip
non-alphanumeric characters before passing to the DB.
- In the SQL, compare against `regexp_replace(upper(isrc), '[^A-Z0-9]',
'', 'g')` so the stored value is normalized at compare time.

Net result: any combination of dashes/spaces/casing in either the
request or the stored row matches. `US-ANG-21-03742`, `USANG2103742`,
`usang2103742`, and `US-ANG2103742` all resolve to the same row.
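
The Go-side normalization described above could look roughly like the following. This is a minimal sketch, not the handler code from this commit: `normalizeISRC` is a hypothetical name, and the sketch assumes the intent is to mirror the SQL expression `regexp_replace(upper(isrc), '[^A-Z0-9]', '', 'g')` for ASCII input.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeISRC is a hypothetical helper (the name is an assumption, not
// taken from the codebase). It uppercases the input and keeps only the
// characters A-Z and 0-9, mirroring the SQL comparison expression.
func normalizeISRC(raw string) string {
	var b strings.Builder
	b.Grow(len(raw))
	for _, r := range strings.ToUpper(raw) {
		if (r >= 'A' && r <= 'Z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	// The four spellings listed above should all normalize to one key.
	inputs := []string{"US-ANG-21-03742", "USANG2103742", "usang2103742", "US-ANG2103742"}
	for _, in := range inputs {
		fmt.Printf("%-16s -> %s\n", in, normalizeISRC(in))
	}
}
```

Normalizing both sides with the same rule is what lets the query param and the stored value disagree on dashes, spaces, and casing and still match. The two implementations need to stay in sync if the rule ever changes.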

## Index

Added `tracks_isrc_normalized_idx`, a **partial functional index**
matching the comparison expression so the new query can index-scan
instead of seq-scanning all of `tracks`:

```sql
create index concurrently if not exists tracks_isrc_normalized_idx
    on public.tracks ((regexp_replace(upper(isrc), '[^A-Z0-9]'::text, ''::text, 'g')))
    where isrc is not null;
```

Verified planner picks it on a 50k-row scratch table:

```
Bitmap Heap Scan on scratch_tracks  (cost=16.43..248.18 rows=500 width=4)
  Recheck Cond: (regexp_replace(upper(isrc), '[^A-Z0-9]'::text, ''::text, 'g'::text) = ANY (...))
  ->  Bitmap Index Scan on scratch_tracks_isrc_norm_idx  (cost=0.00..16.30 rows=500 width=0)
        Index Cond: (regexp_replace(upper(isrc), '[^A-Z0-9]'::text, ''::text, 'g'::text) = ANY (...))
```

Index DDL uses `CREATE INDEX CONCURRENTLY` (no `BEGIN/COMMIT` wrapper)
so the build does not hold a write-blocking `SHARE` lock on `tracks`. `IF
NOT EXISTS` keeps the migration idempotent. Partial on `isrc IS NOT
NULL` since most rows have no ISRC.

Heads-up given #830: that PR moved `0201` away from `CONCURRENTLY`
because the legacy Python Celery task on `challenge_disbursements` was
keeping ~3-minute transactions open continuously, blocking the build's
`virtualxid` wait indefinitely. `tracks` is written by the Go indexer
and isn't subject to that long-transaction pattern, so `CONCURRENTLY`
should complete normally here. If a stuck build needs to be aborted,
drop the invalid index with `DROP INDEX IF EXISTS
tracks_isrc_normalized_idx;` and re-run.

## ISWC

There is no `iswc` query-param lookup today (column exists and is
exposed in API responses, but no handler or sqlc query reads it).
Nothing to mirror — left as-is.

## Test plan

- [x] `TestGetTracksByISRC` covers: stored-with-dashes queried undashed,
queried with same dashes, queried lowercased; stored-without-dashes
queried as-is and queried with inserted dashes.
- [x] Existing tracks tests (`TestTracksEndpoint`,
`TestGetTracksByPermalink`, `TestGetTracksExcludesAccessAuthorities`,
`Test200UnAuthed`) still pass.
- [x] Migration applies cleanly and is idempotent (re-running prints
`relation "tracks_isrc_normalized_idx" already exists, skipping`).
- [x] `EXPLAIN` on a seeded scratch table shows `Bitmap Index Scan on
tracks_isrc_normalized_idx`.
- [ ] After deploy: verify `SELECT indexrelid::regclass, indisvalid FROM
pg_index WHERE indexrelid::regclass::text =
'tracks_isrc_normalized_idx';` returns `indisvalid = true`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>