perf: bump benchmark iterations from 5 to 30 and wire up workflow input #1870

Merged
Mossaka merged 1 commit into main from
feat/1864-bump-iterations
Apr 9, 2026

Collaborator

@Mossaka Mossaka commented Apr 9, 2026

Summary

  • Bump default benchmark iterations from 5 to 30 for statistically meaningful percentiles (P95 = 2nd-worst of 30, not just max of 5)
  • Wire up the existing iterations workflow dispatch input via AWF_BENCHMARK_ITERATIONS env var — previously declared but ignored
  • Increase workflow timeout from 30 to 90 minutes to accommodate more iterations

Test plan

  • npm run build passes
  • npx jest benchmark-utils passes (26 tests)
  • Trigger workflow manually with iterations: 5 to verify env var is read
  • Verify default scheduled run uses 30 iterations

Closes #1864

🤖 Generated with Claude Code

With only 5 iterations, P95/P99/max are all the same value — no
statistical significance in percentile calculations. Bumping to 30
gives meaningful tail-latency data (P95 = 2nd-worst of 30 runs).
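
The percentile claim can be checked with a small sketch. It assumes the nearest-rank definition of a percentile, which may differ from what `scripts/ci/benchmark-performance.ts` actually implements; `percentile` and `range` are hypothetical helpers, and the latencies are made-up values 1..n.

```typescript
// Nearest-rank percentile: the value at 1-based rank ceil(p/100 * n) in the sorted sample.
// Assumption: illustrative definition only, not necessarily the benchmark script's own.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) {
    throw new Error('percentile() requires at least one value');
  }
  const rank = Math.ceil((p / 100) * sorted.length);
  // Clamp so that p = 0 still maps to the first element.
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

// Hypothetical latencies 1, 2, ..., n (already sorted ascending).
const range = (n: number): number[] => Array.from({ length: n }, (_, i) => i + 1);

const five = range(5);
const thirty = range(30);

// With 5 samples, P95, P99 and max are all the 5th value.
console.log(percentile(five, 95), percentile(five, 99), Math.max(...five)); // 5 5 5
// With 30 samples, P95 is the 29th value (2nd-worst); P99 is still the max.
console.log(percentile(thirty, 95), percentile(thirty, 99), Math.max(...thirty)); // 29 30 30
```

Under this definition, 30 samples separate P95 from the maximum, while P99 still coincides with the worst run.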

Changes:
- Read iteration count from AWF_BENCHMARK_ITERATIONS env var (default 30)
- Pass workflow_dispatch iterations input to the script via env var
- Increase workflow timeout from 30 to 90 minutes for more iterations

Closes #1864

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 9, 2026 22:52
Contributor

github-actions Bot commented Apr 9, 2026

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
|---|---|---|---|
| Lines | 85.85% | 85.95% | 📈 +0.10% |
| Statements | 85.76% | 85.85% | 📈 +0.09% |
| Functions | 87.54% | 87.54% | ➡️ +0.00% |
| Branches | 78.56% | 78.61% | 📈 +0.05% |

📁 Per-file Coverage Changes (1 files)

| File | Lines (Before → After) | Statements (Before → After) |
|---|---|---|
| src/docker-manager.ts | 86.3% → 86.6% (+0.36%) | 85.9% → 86.2% (+0.35%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

Contributor

Copilot AI left a comment

Pull request overview

Updates the performance benchmarking defaults and GitHub Actions wiring so scheduled/manual runs collect enough samples for meaningful percentile reporting, and so the workflow’s iterations input actually affects the benchmark script.

Changes:

  • Change benchmark script default iterations from 5 to 30, read from AWF_BENCHMARK_ITERATIONS.
  • Update workflow dispatch input default to 30 and pass it to the script via env var.
  • Increase workflow timeout to accommodate the higher iteration count.

| File | Description |
|---|---|
| scripts/ci/benchmark-performance.ts | Reads benchmark iteration count from AWF_BENCHMARK_ITERATIONS (default 30). |
| .github/workflows/performance-monitor.yml | Sets workflow default iterations to 30, exports env var for the script, and increases timeout. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

// ── Configuration ──────────────────────────────────────────────────

-const ITERATIONS = 5;
+const ITERATIONS = parseInt(process.env.AWF_BENCHMARK_ITERATIONS || '30', 10);

Copilot AI Apr 9, 2026

ITERATIONS is derived directly from AWF_BENCHMARK_ITERATIONS via parseInt(...) with no validation. If the env var is non-numeric (NaN) or <= 0, several benchmarks will produce an empty values array and then stats(values) will throw (stats() requires at least one value). Consider validating/clamping to a positive integer and falling back to the default (e.g., 30) with a clear stderr warning when the input is invalid.
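
A minimal sketch of the failure mode described here. `stats` and `collect` are hypothetical stand-ins; the only property taken from the review is that `stats()` needs at least one value, and the real script's loop may be shaped differently.

```typescript
// Hypothetical stand-in for the script's stats(): rejects an empty sample.
function stats(values: number[]): { mean: number; max: number } {
  if (values.length === 0) {
    throw new Error('stats() requires at least one value');
  }
  const sum = values.reduce((a, b) => a + b, 0);
  return { mean: sum / values.length, max: Math.max(...values) };
}

// Collect one sample per iteration; `measure` is a placeholder for a real benchmark run.
function collect(iterations: number, measure: () => number): number[] {
  const values: number[] = [];
  for (let i = 0; i < iterations; i++) {
    values.push(measure());
  }
  return values;
}

// A non-numeric env value parses to NaN.
const unvalidated = parseInt('thirty', 10);
// `0 < NaN` is false, so the loop body never runs and the sample stays empty.
const samples = collect(unvalidated, () => 1);
console.log(samples.length); // 0
// Calling stats(samples) at this point would throw.
```

Zero and negative counts fail the same way, because the loop condition is false on the first check.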

Suggested change
-const ITERATIONS = parseInt(process.env.AWF_BENCHMARK_ITERATIONS || '30', 10);
+const DEFAULT_ITERATIONS = 30;
+function getIterations(): number {
+  const raw = process.env.AWF_BENCHMARK_ITERATIONS;
+  if (raw === undefined) {
+    return DEFAULT_ITERATIONS;
+  }
+  const parsed = Number.parseInt(raw, 10);
+  if (!Number.isInteger(parsed) || parsed <= 0) {
+    console.error(
+      `Invalid AWF_BENCHMARK_ITERATIONS=${JSON.stringify(raw)}; using default ${DEFAULT_ITERATIONS}.`
+    );
+    return DEFAULT_ITERATIONS;
+  }
+  return parsed;
+}
+const ITERATIONS = getIterations();
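
To see how these rules treat different inputs, here is a sketch of the same validation as a pure function. `parseIterations` and its `warn` parameter are hypothetical names introduced for illustration; they are not part of the PR.

```typescript
// Hypothetical helper: same rules as the suggested change, but the raw value and a
// warning sink are passed in, so it can be exercised without touching process.env.
function parseIterations(
  raw: string | undefined,
  fallback: number,
  warn: (message: string) => void = () => {},
): number {
  if (raw === undefined) {
    return fallback;
  }
  const parsed = Number.parseInt(raw, 10);
  if (!Number.isInteger(parsed) || parsed <= 0) {
    warn(`Invalid iterations value ${JSON.stringify(raw)}; using default ${fallback}.`);
    return fallback;
  }
  return parsed;
}

const warnings: string[] = [];
const record = (message: string): void => {
  warnings.push(message);
};

parseIterations(undefined, 30, record); // 30, no warning
parseIterations('5', 30, record);       // 5
parseIterations('0', 30, record);       // 30, with a warning
parseIterations('-3', 30, record);      // 30, with a warning
parseIterations('abc', 30, record);     // 30, with a warning
parseIterations('', 30, record);        // 30, with a warning
parseIterations('10abc', 30, record);   // 10: parseInt accepts a numeric prefix
```

Two behaviours are worth noting. An empty string is reported as invalid here, whereas the original `|| '30'` expression silently fell back to the default. And `parseInt` still accepts trailing garbage such as `'10abc'`; rejecting that would need a stricter check, for example a digits-only regular expression, which goes beyond what the suggestion proposes.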

Contributor

github-actions Bot commented Apr 9, 2026

Smoke Test Results

✅ GitHub MCP — PRs: "feat: enable cli-proxy for smoke-services and firewall-issue-dispatcher", "chore: upgrade gh-aw to v0.67.4 and disable secret-digger schedules"
✅ Playwright — github.com title contains "GitHub"
✅ File Write — /tmp/gh-aw/agent/smoke-test-claude-24217244196.txt created
✅ Bash — file content verified

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude

Contributor

github-actions Bot commented Apr 9, 2026

🔮 The ancient spirits stir in the circuit of this run.
The oracle has marked missing rites (discussion-query and tavily paths),
but the forge and mirrors answered true where tools were present.
The smoke sigils remain recorded above for judgment.

🔮 The oracle has spoken through Smoke Codex

@Mossaka Mossaka enabled auto-merge (squash) April 9, 2026 23:01
@Mossaka Mossaka disabled auto-merge April 9, 2026 23:01
@Mossaka Mossaka merged commit 46dbac3 into main Apr 9, 2026
58 of 62 checks passed
@Mossaka Mossaka deleted the feat/1864-bump-iterations branch April 9, 2026 23:01
Contributor

github-actions Bot commented Apr 9, 2026

🏗️ Build Test Suite Results

Ecosystem Project Build/Install Tests Status
Bun elysia 1/1 passed ✅ PASS
Bun hono 1/1 passed ✅ PASS
C++ fmt N/A ✅ PASS
C++ json N/A ✅ PASS
Deno oak N/A 1/1 passed ✅ PASS
Deno std N/A 1/1 passed ✅ PASS
.NET hello-world N/A ✅ PASS
.NET json-parse N/A ✅ PASS
Go color passed ✅ PASS
Go env passed ✅ PASS
Go uuid passed ✅ PASS
Java gson 1/1 passed ✅ PASS
Java caffeine 1/1 passed ✅ PASS
Node.js clsx passed ✅ PASS
Node.js execa passed ✅ PASS
Node.js p-limit passed ✅ PASS
Rust fd 1/1 passed ✅ PASS
Rust zoxide 1/1 passed ✅ PASS

Overall: 8/8 ecosystems passed — ✅ PASS

Generated by Build Test Suite for issue #1870 · ● 445.8K ·

Mossaka added a commit that referenced this pull request Apr 9, 2026
If AWF_BENCHMARK_ITERATIONS is non-numeric (NaN) or <= 0, the benchmark
script would produce an empty values array causing stats() to throw.
Validate and clamp to a positive integer, falling back to the default
(30) with a clear stderr warning when the input is invalid.

Follow-up to #1870 addressing Copilot review feedback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>