[claude] feat(bench): add --gh-json-v3 emitter and post-ingest script (#7638)
## Summary
Implements the alpha **emitter** component for `bench.vortex.dev` v3,
per
[`benchmarks-website/planning/components/emitter.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/components/emitter.md).
**Purely additive** to v2's emission path: the existing
`-d gh-json -o ...` form is untouched.
### Rust emitter (`vortex-bench`)
- New `vortex-bench/src/v3.rs` module with one record type per `kind`
(`query_measurement`, `compression_time`, `compression_size`,
`random_access_time`, `vector_search_run`) plus a serde-tagged
`V3Record` enum. Field shapes match
[`02-contracts.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/02-contracts.md);
dataset/variant/scale-factor mapping follows
[`benchmark-mapping.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/benchmark-mapping.md).
- Each benchmark binary gains a `--gh-json-v3 <PATH>` flag that writes
  bare records as JSONL (no envelope), alongside the legacy
  `--display-format gh-json -o ...` flow:
  - `compress-bench`: `compression_time` (encode/decode) +
    `compression_size`. Cross-format ratios are **not** emitted; ratios
    are computed read-side per `decisions.md`.
  - `datafusion-bench`, `duckdb-bench`, `lance-bench`:
    `query_measurement`, with optional memory fields populated when
    `--track-memory` is on. `QueryMeasurement` and the paired
    `MemoryMeasurement` collapse into one record
    (`SqlBenchmarkRunner::v3_records`).
  - `random-access-bench`: `random_access_time`, with the dataset name
    plumbed alongside `TimingMeasurement`.
  - `vector-search-bench`: `vector_search_run`, with `dataset`, `layout`,
    `threshold`, `iterations` plumbed in (they don't live on `ScanTiming`).
- `insta` snapshot tests cover one record per `kind`, scrubbing
`commit_sha` and `env_triple`.
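
To illustrate the bare-JSONL shape from the consuming side, here is a minimal Python sketch. It is not the Rust implementation: the five `kind` values come from this PR, but the helper names (`write_jsonl`, `read_jsonl`) and every field other than `kind` (`dataset`, `value_ns`, `bytes`) are hypothetical placeholders. The real field shapes are defined in `02-contracts.md`.

```python
import io
import json

# The five record kinds named in this PR.
KNOWN_KINDS = {
    "query_measurement",
    "compression_time",
    "compression_size",
    "random_access_time",
    "vector_search_run",
}


def write_jsonl(records, out):
    """Write one bare JSON object per line, with no surrounding envelope."""
    for record in records:
        out.write(json.dumps(record, sort_keys=True) + "\n")


def read_jsonl(src):
    """Parse JSONL, rejecting any record whose `kind` tag is unknown."""
    records = []
    for line_no, line in enumerate(src, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        kind = record.get("kind")
        if kind not in KNOWN_KINDS:
            raise ValueError(f"line {line_no}: unknown kind {kind!r}")
        records.append(record)
    return records


# Hypothetical sample records: only the `kind` values are taken from the PR.
sample = [
    {"kind": "compression_time", "dataset": "example", "value_ns": 1200},
    {"kind": "compression_size", "dataset": "example", "bytes": 4096},
]

buf = io.StringIO()
write_jsonl(sample, buf)
buf.seek(0)
round_tripped = read_jsonl(buf)
```

Keeping the `kind` discriminator inside each line means a reader can validate or route records one line at a time, without needing an outer envelope.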
### Post-ingest script
`scripts/post-ingest.py` (Python 3, stdlib only — `urllib`, `json`,
`subprocess`):
- reads JSONL of records,
- fills the `commit` envelope from `git show` for the SHA passed in,
- wraps in `{run_meta, commit, records}` per the contract,
- POSTs to `<server>/api/ingest` with `Authorization: Bearer ...` from
`INGEST_BEARER_TOKEN`,
- exits non-zero on 4xx/5xx. **No retries, no spool, no S3 outbox** —
deferred per the alpha plan.
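
The steps above can be sketched roughly as follows. This is an illustrative reconstruction using only `urllib`, `json`, and `subprocess`, not the contents of `scripts/post-ingest.py`: the function names, the `git show` format string, the keys inside `commit`, and the `run_meta` contents are all assumptions. Only the `{run_meta, commit, records}` wrapper, the `/api/ingest` path, the Bearer header, and the exit-code behaviour come from this PR.

```python
import json
import os
import subprocess
import sys
import urllib.error
import urllib.request


def read_records(path):
    """Read bare records from a JSONL file, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def commit_info(sha):
    """Fill commit metadata from `git show`. The key names are assumed."""
    fmt = "%H%x00%an%x00%aI%x00%s"  # NUL-separated: sha, author, date, subject
    out = subprocess.run(
        ["git", "show", "-s", f"--format={fmt}", sha],
        check=True, capture_output=True, text=True,
    ).stdout.rstrip("\n")
    full_sha, author, timestamp, subject = out.split("\x00")
    return {"sha": full_sha, "author": author,
            "timestamp": timestamp, "message": subject}


def build_envelope(run_meta, commit, records):
    """Wrap the records in the {run_meta, commit, records} envelope."""
    return {"run_meta": run_meta, "commit": commit, "records": records}


def build_request(server, token, envelope):
    """Build the POST to <server>/api/ingest with a Bearer token."""
    return urllib.request.Request(
        server.rstrip("/") + "/api/ingest",
        data=json.dumps(envelope).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def post(request):
    """Return 0 on success and 1 on any 4xx/5xx. No retries, no spool."""
    try:
        with urllib.request.urlopen(request) as resp:
            print(resp.read().decode("utf-8"))
            return 0
    except urllib.error.HTTPError as err:
        # Surface the server's response body on stderr.
        sys.stderr.write(err.read().decode("utf-8", "replace") + "\n")
        return 1


def main(argv):
    """Hypothetical CLI shape: <records.jsonl> <sha> <server>."""
    jsonl_path, sha, server = argv[1:4]
    token = os.environ["INGEST_BEARER_TOKEN"]
    # run_meta is left empty here; its fields are defined by the contract.
    envelope = build_envelope({}, commit_info(sha), read_records(jsonl_path))
    return post(build_request(server, token, envelope))
```

A script built this way would end with `sys.exit(main(sys.argv))`, so the process exit code mirrors the HTTP outcome.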
### Out of scope (deferred)
CI workflow integration, dual-write, `bench-orchestrator` updates,
retry/spool/outbox, replacing the v2 CLI form. All listed in
[`deferred.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/deferred.md).
## Test plan
- [x] `cargo test -p vortex-bench --lib` — 48 passed (7 new `v3` tests,
one snapshot per kind plus a JSONL round-trip).
- [x] `cargo build -p vortex-bench -p compress-bench -p datafusion-bench
-p duckdb-bench -p lance-bench -p random-access-bench -p
vector-search-bench` — all clean.
- [x] `cargo clippy --all-targets` on changed crates (skipping
`duckdb-bench`, blocked by an unrelated pre-existing
`cognitive_complexity` lint in `vortex-duckdb` on `ct/benchmarks-v3`).
- [x] `cargo +nightly fmt --all`.
- [x] End-to-end smoke: `scripts/post-ingest.py` against a Python
`http.server` mock — 200 → exit 0 with `{"inserted":1,"updated":0}`; 400
→ exit 1 with the server body on stderr.
- [ ] Real round-trip against an actual alpha server — blocked on the
server component landing (acceptance criterion 3 in the emitter plan;
verifiable once the server PR exists).
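
For reference, a smoke test of the kind described above could be set up along these lines. This is a hedged sketch, not the mock actually used: the handler name `MockIngest`, the rule for choosing between 200 and 400 (whether the body parses as JSON and contains `records`), and the error body are assumptions. Only the `{"inserted":1,"updated":0}` success body and the 200/400 status codes come from this PR.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class MockIngest(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the ingest endpoint."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        try:
            ok = "records" in json.loads(raw)
        except (ValueError, TypeError):
            ok = False
        if ok:
            status, body = 200, b'{"inserted":1,"updated":0}'
        else:
            status, body = 400, b'{"error":"bad envelope"}'  # placeholder body
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging


# Bypass any proxy configured in the environment for the local call.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def post(url, payload):
    """POST raw bytes; return (status, body) for success and HTTP errors."""
    req = urllib.request.Request(url, data=payload, method="POST")
    try:
        with opener.open(req) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode()


server = HTTPServer(("127.0.0.1", 0), MockIngest)  # port 0 picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/api/ingest"

envelope = {"run_meta": {}, "commit": {}, "records": []}
good = post(url, json.dumps(envelope).encode())
bad = post(url, b"not json")

server.shutdown()
server.server_close()
```

Pointing a client at `url` exercises both the success path and the error path without a real server.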
https://claude.ai/code/session_017qh4ju4FtkizW6s67JEhPW
---
_Generated by [Claude
Code](https://claude.ai/code/session_017qh4ju4FtkizW6s67JEhPW)_
---------
Signed-off-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>