fix(sqlite_writer): skip index cells that exceed a single page by jjoos · Pull Request #303 · DeusData/codebase-memory-mcp

jjoos · 2026-04-30T11:16:20Z

Problem

write_index_btree() crashes with SIGBUS when indexing large repositories (observed on a Ruby codebase with ~215K nodes, specifically the HackerOne core monolith with ~67K functions).

The crash occurs in the page-buffer writer when an index cell's payload (qualified_name + file_path + properties) exceeds the SQLite page size. The code flushes the current page and assumes the fresh page will always accept the cell — but oversized cells still fail pb_cell_fits() after the flush. The subsequent pb_add_cell() causes a content_offset underflow, corrupting the page header and triggering SIGBUS on the next write.
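The underflow can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the struct layout, the field widths, the 4096-byte page size, and the helper `unchecked_new_offset()` are hypothetical and are not taken from sqlite_writer.c. Only the names `pb_cell_fits` and `content_offset` come from the description above. The 8-byte leaf header and 2-byte cell pointer follow the SQLite file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u        /* assumed page size */
#define LEAF_HEADER_SIZE 8u    /* b-tree leaf page header size in the SQLite file format */
#define CELL_PTR_SIZE 2u       /* one 2-byte entry in the cell pointer array */

/* Hypothetical page buffer; the real struct in sqlite_writer.c may differ. */
typedef struct {
    uint16_t cell_count;
    uint16_t content_offset;   /* start of the cell content area, grows downward */
} page_buf;

static void pb_init(page_buf *pb) {
    pb->cell_count = 0;
    pb->content_offset = PAGE_SIZE;   /* empty page: content area starts at the end */
}

/* Free bytes between the end of the cell pointer array and the content area. */
static size_t pb_free_bytes(const page_buf *pb) {
    size_t ptr_end = LEAF_HEADER_SIZE + (size_t)pb->cell_count * CELL_PTR_SIZE;
    assert(pb->content_offset >= ptr_end);
    return pb->content_offset - ptr_end;
}

/* A cell fits only if both its bytes and its pointer entry fit. */
static int pb_cell_fits(const page_buf *pb, size_t cell_len) {
    return cell_len + CELL_PTR_SIZE <= pb_free_bytes(pb);
}

/* The offset an unchecked add would compute. When cell_len exceeds
 * content_offset, the subtraction wraps and the result lands far
 * outside the page. */
static uint16_t unchecked_new_offset(const page_buf *pb, size_t cell_len) {
    return (uint16_t)(pb->content_offset - cell_len);
}
```

Under these assumptions, a 5000-byte cell on an empty page fails `pb_cell_fits()`, and an unchecked add would compute offset 64632, well past the end of a 4096-byte page.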

Reproduction

Index any repository with functions whose qualified names + file paths exceed ~4000 bytes combined (the SQLite page size minus overhead). A Ruby monolith with deeply nested namespaces and long file paths reliably triggers this.

codebase-memory-mcp index --path /path/to/large-ruby-repo
# → SIGBUS in write_index_btree at sqlite_writer.c:1218

Fix

After pb_promote_and_flush(), re-check pb_cell_fits(). If the cell still doesn't fit on an empty page, continue past it. Index entries whose keys exceed a full page cannot be stored in a leaf and are silently dropped — the rest of the index is correctly preserved.
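The control flow described above can be sketched roughly as follows. This is not the patch itself: `page_buf`, `write_stats`, `pb_flush()` (a stand-in for `pb_promote_and_flush()`), and `write_cells()` are hypothetical simplifications that track only byte counts, and the 4096-byte page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u        /* assumed page size */
#define LEAF_HEADER_SIZE 8u
#define CELL_PTR_SIZE 2u

/* Hypothetical page buffer that tracks sizes only, not bytes. */
typedef struct {
    size_t used;               /* header + cell pointers + cell content */
    size_t cell_count;
} page_buf;

typedef struct {
    size_t pages_flushed;
    size_t cells_written;
    size_t cells_skipped;
} write_stats;

static void pb_reset(page_buf *pb) {
    pb->used = LEAF_HEADER_SIZE;
    pb->cell_count = 0;
}

static int pb_cell_fits(const page_buf *pb, size_t cell_len) {
    return pb->used + CELL_PTR_SIZE + cell_len <= PAGE_SIZE;
}

static void pb_add_cell(page_buf *pb, size_t cell_len) {
    assert(pb_cell_fits(pb, cell_len));   /* precondition the old code violated */
    pb->used += CELL_PTR_SIZE + cell_len;
    pb->cell_count++;
}

/* Stand-in for pb_promote_and_flush(): counts non-empty pages and resets. */
static void pb_flush(page_buf *pb, write_stats *st) {
    if (pb->cell_count > 0) st->pages_flushed++;
    pb_reset(pb);
}

static write_stats write_cells(const size_t *cell_lens, size_t n) {
    page_buf pb;
    write_stats st = {0, 0, 0};
    pb_reset(&pb);
    for (size_t i = 0; i < n; i++) {
        if (!pb_cell_fits(&pb, cell_lens[i])) {
            pb_flush(&pb, &st);
            /* Re-check on the fresh page: an oversized cell still fails. */
            if (!pb_cell_fits(&pb, cell_lens[i])) {
                st.cells_skipped++;
                continue;
            }
        }
        pb_add_cell(&pb, cell_lens[i]);
        st.cells_written++;
    }
    pb_flush(&pb, &st);        /* flush the final partial page */
    return st;
}
```

With cell lengths {1000, 5000, 3000, 200}, this sketch writes three cells across two pages and skips the 5000-byte cell. Counting skipped cells is an addition of the sketch; the description above says only that they are dropped silently.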

This is the index-btree counterpart to PR #175 (record overflow pages), which fixed the same shape of bug for the record-btree writer. The record side uses overflow pages; the index side of this writer has no overflow mechanism, so skipping is the correct behavior.

Changes

internal/cbm/sqlite_writer.c: 10 lines added, 3 removed in write_index_btree().

Testing

Verified by indexing a 215K-node Ruby repo that previously crashed — completes successfully with all other nodes indexed.
The skipped cells are edge cases (extremely long qualified names) that don't affect query correctness for normal usage.

Related

Closes the index-cell side of #139 ("Stack overflow crash in autoindex_thread → cbm_pipeline_pass_configlink (v0.5.6, macOS ARM64)") and #187 ("Direct SQLite writer crashes on oversized cells during indexing (repro with netty on macOS arm64)"). The record-cell side was fixed in #175 ("fix: implement overflow pages in sqlite_writer to prevent SIGBUS on large records (#139)").
PR #217 ("fix(extraction): replace fixed traversal stacks with growable arena-allocated stacks") addresses a different SIGBUS trigger in the parser.

write_index_btree() flushed the page buffer whenever the next cell didn't fit, on the assumption that a freshly-initialised page would always accept any cell. That invariant fails for index cells whose payload (qualified_name, file_path, etc.) is larger than a full SQLite page: after the flush, pb_cell_fits() is still false but the code calls pb_add_cell() anyway. The subsequent content_offset underflow corrupts the page header and the next write triggers SIGBUS on large repos (observed on a Ruby codebase with ~215K nodes).

Re-check pb_cell_fits() after the flush and continue past cells that still don't fit. Index entries whose keys exceed a full page can't be stored in a leaf, so they are dropped; the rest of the index is correctly preserved and the indexer no longer crashes.

Record overflow pages (DeusData#175) already covers the record-btree side of this same shape of bug.

DeusData added the bug and stability/performance labels on May 4, 2026

jjoos wants to merge 1 commit into DeusData:main from jjoos:fix/sqlite-index-btree-overflow
