
Commit 3745f35

connortsui20 and claude committed
[claude] feat(bench): add --gh-json-v3 emitter and post-ingest script (#7638)
## Summary

Implements the alpha **emitter** component for `bench.vortex.dev` v3, per [`benchmarks-website/planning/components/emitter.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/components/emitter.md). **Purely additive** to v2's emission path — the existing `-d gh-json -o ...` form is untouched.

### Rust emitter (`vortex-bench`)

- New `vortex-bench/src/v3.rs` module with one record type per `kind` (`query_measurement`, `compression_time`, `compression_size`, `random_access_time`, `vector_search_run`) plus a serde-tagged `V3Record` enum. Field shapes match [`02-contracts.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/02-contracts.md); dataset/variant/scale-factor mapping follows [`benchmark-mapping.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/benchmark-mapping.md).
- Each benchmark binary gains a `--gh-json-v3 <PATH>` flag that writes bare records as JSONL (no envelope), alongside the legacy `--display-format gh-json -o ...` flow:
  - `compress-bench` — `compression_time` (encode/decode) + `compression_size`. Cross-format ratios are **not** emitted; ratios are computed read-side per `decisions.md`.
  - `datafusion-bench`, `duckdb-bench`, `lance-bench` — `query_measurement`, with optional memory fields populated when `--track-memory` is on. `QueryMeasurement` and the paired `MemoryMeasurement` collapse into one record (`SqlBenchmarkRunner::v3_records`).
  - `random-access-bench` — `random_access_time`, with the dataset name plumbed alongside `TimingMeasurement`.
  - `vector-search-bench` — `vector_search_run`, with `dataset`, `layout`, `threshold`, `iterations` plumbed in (they don't live on `ScanTiming`).
- `insta` snapshot tests cover one record per `kind`, scrubbing `commit_sha` and `env_triple`.
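The serde-tagged layout means a consumer can route records without any envelope: each JSONL line carries its own discriminator. A minimal Python sketch of read-side dispatch — assuming only that the tag field is literally named `kind` and takes one of the five values listed above (the rest of each record's shape lives in `02-contracts.md`); `partition_records` is a hypothetical helper, not part of this PR:

```python
import json

# The five `kind` tags named in this PR. A serde internally-tagged enum
# serializes each variant as a flat object containing its own tag field.
KNOWN_KINDS = {
    "query_measurement",
    "compression_time",
    "compression_size",
    "random_access_time",
    "vector_search_run",
}


def partition_records(jsonl_text: str) -> dict:
    """Group bare v3 records by their `kind` tag, rejecting unknown kinds."""
    by_kind: dict = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        kind = record["kind"]
        if kind not in KNOWN_KINDS:
            raise ValueError(f"unknown record kind: {kind}")
        by_kind.setdefault(kind, []).append(record)
    return by_kind
```

Rejecting unknown kinds up front keeps schema drift loud rather than silently dropping records.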
### Post-ingest script

`scripts/post-ingest.py` (Python 3, stdlib only — `urllib`, `json`, `subprocess`):

- reads JSONL of records,
- fills the `commit` envelope from `git show` for the SHA passed in,
- wraps in `{run_meta, commit, records}` per the contract,
- POSTs to `<server>/api/ingest` with `Authorization: Bearer ...` from `INGEST_BEARER_TOKEN`,
- exits non-zero on 4xx/5xx.

**No retries, no spool, no S3 outbox** — deferred per the alpha plan.

### Out of scope (deferred)

CI workflow integration, dual-write, `bench-orchestrator` updates, retry/spool/outbox, replacing the v2 CLI form. All listed in [`deferred.md`](https://github.com/vortex-data/vortex/blob/ct/benchmarks-v3/benchmarks-website/planning/deferred.md).

## Test plan

- [x] `cargo test -p vortex-bench --lib` — 48 passed (7 new `v3` tests, one snapshot per kind plus a JSONL round-trip).
- [x] `cargo build -p vortex-bench -p compress-bench -p datafusion-bench -p duckdb-bench -p lance-bench -p random-access-bench -p vector-search-bench` — all clean.
- [x] `cargo clippy --all-targets` on changed crates (skipping `duckdb-bench`, blocked by an unrelated pre-existing `cognitive_complexity` lint in `vortex-duckdb` on `ct/benchmarks-v3`).
- [x] `cargo +nightly fmt --all`.
- [x] End-to-end smoke: `scripts/post-ingest.py` against a Python `http.server` mock — 200 → exit 0 with `{"inserted":1,"updated":0}`; 400 → exit 1 with the server body on stderr.
- [ ] Real round-trip against an actual alpha server — blocked on the server component landing (acceptance criterion 3 in the emitter plan; verifiable once the server PR exists).

https://claude.ai/code/session_017qh4ju4FtkizW6s67JEhPW

---

_Generated by [Claude Code](https://claude.ai/code/session_017qh4ju4FtkizW6s67JEhPW)_

---------

Signed-off-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Connor Tsui <connor.tsui20@gmail.com>
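The envelope-and-POST flow above can be sketched in stdlib Python. This is a hedged outline, not the script itself: `build_payload` and `post_ingest` are hypothetical names, and the real script additionally fills the `commit` object from `git show`:

```python
import json
import os
import urllib.request


def build_payload(run_meta: dict, commit: dict, records: list) -> dict:
    """Wrap bare JSONL records in the {run_meta, commit, records} envelope."""
    return {"run_meta": run_meta, "commit": commit, "records": records}


def post_ingest(server: str, payload: dict) -> dict:
    """POST the envelope to <server>/api/ingest with a bearer token.

    urllib raises urllib.error.HTTPError on 4xx/5xx, which the caller can
    turn into a non-zero exit code.
    """
    token = os.environ["INGEST_BEARER_TOKEN"]
    req = urllib.request.Request(
        f"{server}/api/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Letting `urllib` raise `HTTPError` on error statuses keeps the exit-code contract simple: catch it at the top level, print the server body to stderr, and exit 1.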
1 parent 1e3e252 commit 3745f35

23 files changed

Lines changed: 1085 additions & 4 deletions

Cargo.lock

Lines changed: 1 addition & 0 deletions

benchmarks/compress-bench/src/main.rs

Lines changed: 50 additions & 4 deletions
```diff
@@ -41,6 +41,7 @@ use vortex_bench::public_bi::PBIDataset::Euro2016;
 use vortex_bench::public_bi::PBIDataset::Food;
 use vortex_bench::public_bi::PBIDataset::HashTags;
 use vortex_bench::setup_logging_and_tracing_with_format;
+use vortex_bench::v3;
 
 #[derive(Parser, Debug)]
 #[command(version, about, long_about = None)]
@@ -68,6 +69,10 @@ struct Args {
     display_format: DisplayFormat,
     #[arg(short, long)]
     output_path: Option<PathBuf>,
+    /// Additionally write v3 JSONL records to this path. See
+    /// `benchmarks-website/planning/02-contracts.md`.
+    #[arg(long)]
+    gh_json_v3: Option<PathBuf>,
     #[arg(long)]
     tracing: bool,
     /// Format for the primary stderr log sink. `text` is the default human-readable format;
@@ -89,6 +94,7 @@ async fn main() -> anyhow::Result<()> {
         args.ops,
         args.display_format,
         args.output_path,
+        args.gh_json_v3,
     )
     .await
 }
@@ -114,6 +120,7 @@ async fn run_compress(
     ops: Vec<CompressOp>,
     display_format: DisplayFormat,
     output_path: Option<PathBuf>,
+    gh_json_v3: Option<PathBuf>,
 ) -> anyhow::Result<()> {
     let targets = formats
         .iter()
@@ -163,17 +170,24 @@ async fn run_compress(
     let progress = ProgressBar::new((datasets.len() * formats.len() * ops.len()) as u64);
 
     let mut measurements = vec![];
+    let mut v3_records: Vec<v3::V3Record> = Vec::new();
 
     for dataset_handle in datasets.into_iter() {
-        let m = run_benchmark_for_dataset(&progress, &formats, &ops, iterations, dataset_handle)
-            .await?;
+        let (m, mut records) =
+            run_benchmark_for_dataset(&progress, &formats, &ops, iterations, dataset_handle)
+                .await?;
         measurements.push(m);
+        v3_records.append(&mut records);
     }
 
     let measurements = CompressMeasurements::from_iter(measurements);
 
     progress.finish();
 
+    if let Some(path) = gh_json_v3 {
+        v3::write_jsonl_to_path(&path, &v3_records)?;
+    }
+
     let mut writer = create_output_writer(&display_format, output_path, BENCHMARK_ID)?;
 
     match display_format {
@@ -202,8 +216,9 @@ async fn run_benchmark_for_dataset(
     ops: &[CompressOp],
     iterations: usize,
     dataset_handle: &dyn Dataset,
-) -> anyhow::Result<CompressMeasurements> {
+) -> anyhow::Result<(CompressMeasurements, Vec<v3::V3Record>)> {
     let bench_name = dataset_handle.name();
+    let (v3_dataset, v3_variant) = dataset_handle.v3_dataset_dims();
     tracing::info!("Running {bench_name} benchmark");
 
     // Get the parquet file path for this dataset
@@ -213,6 +228,7 @@ async fn run_benchmark_for_dataset(
     let mut timings = Vec::new();
     let mut measurements_map: HashMap<(Format, CompressOp), Duration> = HashMap::new();
     let mut compressed_sizes: HashMap<Format, u64> = HashMap::new();
+    let mut v3_records: Vec<v3::V3Record> = Vec::new();
 
     for format in formats {
         let compressor = get_compressor(*format);
@@ -228,6 +244,24 @@ async fn run_benchmark_for_dataset(
         )
         .await?;
         compressed_sizes.insert(*format, result.compressed_size);
+        let all_runs_ns: Vec<u64> = result
+            .all_runs
+            .iter()
+            .map(|d| u64::try_from(d.as_nanos()).unwrap_or(u64::MAX))
+            .collect();
+        v3_records.push(v3::compression_time_record(
+            &result.timing,
+            v3_dataset,
+            v3_variant,
+            CompressOp::Compress,
+            all_runs_ns,
+        ));
+        v3_records.push(v3::compression_size_record(
+            v3_dataset,
+            v3_variant,
+            *format,
+            result.compressed_size,
+        ));
         ratios.extend(result.ratios);
         timings.push(result.timing);
         result.time
@@ -240,6 +274,18 @@ async fn run_benchmark_for_dataset(
             bench_name,
         )
         .await?;
+        let all_runs_ns: Vec<u64> = result
+            .all_runs
+            .iter()
+            .map(|d| u64::try_from(d.as_nanos()).unwrap_or(u64::MAX))
+            .collect();
+        v3_records.push(v3::compression_time_record(
+            &result.timing,
+            v3_dataset,
+            v3_variant,
+            CompressOp::Decompress,
+            all_runs_ns,
+        ));
         timings.push(result.timing);
         result.time
     }
@@ -258,5 +304,5 @@ async fn run_benchmark_for_dataset(
         &mut ratios,
     );
 
-    Ok(CompressMeasurements { timings, ratios })
+    Ok((CompressMeasurements { timings, ratios }, v3_records))
 }
```

benchmarks/datafusion-bench/src/main.rs

Lines changed: 10 additions & 0 deletions
```diff
@@ -44,6 +44,7 @@ use vortex_bench::runner::BenchmarkQueryResult;
 use vortex_bench::runner::SqlBenchmarkRunner;
 use vortex_bench::runner::filter_queries;
 use vortex_bench::setup_logging_and_tracing;
+use vortex_bench::v3;
 use vortex_datafusion::metrics::VortexMetricsFinder;
 
 /// Common arguments shared across benchmarks
@@ -82,6 +83,11 @@ struct Args {
     #[arg(short)]
     output_path: Option<PathBuf>,
 
+    /// Additionally write v3 JSONL records to this path. See
+    /// `benchmarks-website/planning/02-contracts.md`.
+    #[arg(long)]
+    gh_json_v3: Option<PathBuf>,
+
     #[arg(long, default_value_t = false)]
     show_metrics: bool,
 
@@ -226,6 +232,10 @@ async fn main() -> anyhow::Result<()> {
         print_metrics(plans.as_ref());
     }
 
+    if let Some(path) = args.gh_json_v3.as_ref() {
+        v3::write_jsonl_to_path(path, &runner.v3_records())?;
+    }
+
     let benchmark_id = format!("datafusion-{}", benchmark.dataset_name());
     let writer = create_output_writer(&args.display_format, args.output_path, &benchmark_id)?;
     runner.export_to(&args.display_format, writer)?;
```

benchmarks/duckdb-bench/src/main.rs

Lines changed: 10 additions & 0 deletions
```diff
@@ -24,6 +24,7 @@ use vortex_bench::runner::BenchmarkMode;
 use vortex_bench::runner::SqlBenchmarkRunner;
 use vortex_bench::runner::filter_queries;
 use vortex_bench::setup_logging_and_tracing;
+use vortex_bench::v3;
 
 /// Common arguments shared across benchmarks
 #[derive(Parser)]
@@ -58,6 +59,11 @@ struct Args {
     #[arg(short)]
     output_path: Option<PathBuf>,
 
+    /// Additionally write v3 JSONL records to this path. See
+    /// `benchmarks-website/planning/02-contracts.md`.
+    #[arg(long)]
+    gh_json_v3: Option<PathBuf>,
+
     #[arg(long, default_value_t = false)]
     track_memory: bool,
 
@@ -190,6 +196,10 @@ fn main() -> anyhow::Result<()> {
     )?;
 
     if !args.explain {
+        if let Some(path) = args.gh_json_v3.as_ref() {
+            v3::write_jsonl_to_path(path, &runner.v3_records())?;
+        }
+
         let benchmark_id = format!("duckdb-{}", benchmark.dataset_name());
         let writer = create_output_writer(&args.display_format, args.output_path, &benchmark_id)?;
         runner.export_to(&args.display_format, writer)?;
```

benchmarks/lance-bench/src/main.rs

Lines changed: 10 additions & 0 deletions
```diff
@@ -28,6 +28,7 @@ use vortex_bench::runner::BenchmarkQueryResult;
 use vortex_bench::runner::SqlBenchmarkRunner;
 use vortex_bench::runner::filter_queries;
 use vortex_bench::setup_logging_and_tracing;
+use vortex_bench::v3;
 
 /// Lance benchmark tool - runs SQL queries against Lance format data using DataFusion
 #[derive(Parser)]
@@ -59,6 +60,11 @@ struct Args {
     #[arg(short)]
     output_path: Option<PathBuf>,
 
+    /// Additionally write v3 JSONL records to this path. See
+    /// `benchmarks-website/planning/02-contracts.md`.
+    #[arg(long)]
+    gh_json_v3: Option<PathBuf>,
+
     #[arg(long, default_value_t = false)]
     hide_progress_bar: bool,
 
@@ -124,6 +130,10 @@ async fn main() -> anyhow::Result<()> {
     )
     .await?;
 
+    if let Some(path) = args.gh_json_v3.as_ref() {
+        v3::write_jsonl_to_path(path, &runner.v3_records())?;
+    }
+
     let benchmark_id = format!("lance-{}", benchmark.dataset_name());
     let writer = create_output_writer(&args.display_format, args.output_path, &benchmark_id)?;
     runner.export_to(&args.display_format, writer)?;
```

benchmarks/random-access-bench/src/main.rs

Lines changed: 14 additions & 0 deletions
```diff
@@ -32,6 +32,7 @@ use vortex_bench::random_access::RandomAccessor;
 use vortex_bench::random_access::VortexRandomAccessor;
 use vortex_bench::setup_logging_and_tracing;
 use vortex_bench::utils::constants::STORAGE_NVME;
+use vortex_bench::v3;
 
 // ---------------------------------------------------------------------------
 // Access patterns
@@ -173,6 +174,10 @@ struct Args {
     display_format: DisplayFormat,
     #[arg(short)]
     output_path: Option<PathBuf>,
+    /// Additionally write v3 JSONL records to this path. See
+    /// `benchmarks-website/planning/02-contracts.md`.
+    #[arg(long)]
+    gh_json_v3: Option<PathBuf>,
     /// Which datasets to benchmark random access on.
     #[arg(
         long,
@@ -205,6 +210,7 @@ async fn main() -> Result<()> {
         args.open_mode,
         args.display_format,
         args.output_path,
+        args.gh_json_v3,
     )
     .await
 }
@@ -340,6 +346,7 @@ async fn run_random_access(
     open_mode: OpenMode,
     display_format: DisplayFormat,
     output_path: Option<PathBuf>,
+    gh_json_v3: Option<PathBuf>,
 ) -> Result<()> {
     let reopen_variants: &[bool] = match open_mode {
         OpenMode::Cached => &[false],
@@ -358,6 +365,7 @@ async fn run_random_access(
 
     let mut targets = Vec::new();
     let mut measurements = Vec::new();
+    let mut v3_records: Vec<v3::V3Record> = Vec::new();
 
     for dataset in datasets {
         for format in &formats {
@@ -380,6 +388,7 @@ async fn run_random_access(
             )
             .await?;
 
+            v3_records.push(v3::random_access_record(&measurement, dataset.name()));
             targets.push(measurement.target);
             measurements.push(measurement);
             progress.inc(1);
@@ -406,6 +415,7 @@ async fn run_random_access(
             )
             .await?;
 
+            v3_records.push(v3::random_access_record(&measurement, dataset.name()));
             targets.push(measurement.target);
             measurements.push(measurement);
             progress.inc(1);
@@ -416,6 +426,10 @@ async fn run_random_access(
 
     progress.finish();
 
+    if let Some(path) = gh_json_v3 {
+        v3::write_jsonl_to_path(&path, &v3_records)?;
+    }
+
     let mut writer = create_output_writer(&display_format, output_path, BENCHMARK_ID)?;
 
     match display_format {
```

benchmarks/vector-search-bench/src/main.rs

Lines changed: 36 additions & 0 deletions
```diff
@@ -28,6 +28,7 @@ use vector_search_bench::scan::ScanConfig;
 use vector_search_bench::scan::ScanTiming;
 use vector_search_bench::scan::run_search_scan;
 use vortex_bench::setup_logging_and_tracing;
+use vortex_bench::v3;
 use vortex_bench::vector_dataset;
 use vortex_bench::vector_dataset::TrainLayout;
 use vortex_bench::vector_dataset::VectorDataset;
@@ -71,6 +72,11 @@ struct Args {
     #[arg(long)]
     output_path: Option<PathBuf>,
 
+    /// Additionally write v3 JSONL records to this path. See
+    /// `benchmarks-website/planning/02-contracts.md`.
+    #[arg(long)]
+    gh_json_v3: Option<PathBuf>,
+
     /// Emit verbose tracing.
     #[arg(short, long)]
     verbose: bool,
@@ -143,6 +149,36 @@ async fn main() -> Result<()> {
         vortex_results: &pairs,
     };
 
+    // Emit v3 JSONL if requested. The records carry the per-scan dimensions that
+    // ScanTiming itself does not (dataset, layout, threshold).
+    if let Some(path) = args.gh_json_v3.as_ref() {
+        let records: Vec<v3::V3Record> = scan_timings
+            .iter()
+            .map(|scan| {
+                let all_runs_ns: Vec<u64> = scan
+                    .all_runs
+                    .iter()
+                    .map(|d| u64::try_from(d.as_nanos()).unwrap_or(u64::MAX))
+                    .collect();
+                let median_ns = u64::try_from(scan.median.as_nanos()).unwrap_or(u64::MAX);
+                v3::vector_search_record(
+                    v3::VectorSearchDims {
+                        dataset: dataset.name(),
+                        layout: layout.label(),
+                        flavor: scan.flavor.label(),
+                        threshold: f64::from(args.threshold),
+                    },
+                    median_ns,
+                    all_runs_ns,
+                    scan.matches,
+                    scan.rows_scanned,
+                    scan.bytes_scanned,
+                )
+            })
+            .collect();
+        v3::write_jsonl_to_path(path, &records)?;
+    }
+
     // Print the results.
     if let Some(path) = args.output_path {
         let mut file =
```
