Add competing benchmark workflow: pdu (master) vs dust vs dua by Copilot · Pull Request #340 · KSXGitHub/parallel-disk-usage

Copilot · 2026-03-07T12:00:36Z

Adds a focused workflow_dispatch benchmark workflow comparing only pdu (built from master with --release), dust (latest), and dua (latest) — a trimmed-down variant of the competing_benchmark job in deploy.yaml.

Changes

.github/workflows/competing-benchmark.yaml — Standalone workflow triggered via workflow_dispatch. Builds pdu from the checked-out master branch with --release, dynamically fetches the latest dust and dua releases via GitHub API, runs hyperfine benchmarks, and uploads a report archive as a workflow artifact.
ci/github-actions/competing-benchmark-pdu-dust-dua.ts — Slimmed benchmark matrix covering three categories (apparent-size, block-size, deduplicate-hardlinks) with only dust and dua as competitors (drops ncdu, gdu, du, dutree).

Benchmark Results (Linux 5.12 source tree, ~1.2 GB)

Apparent size

Command	Mean [ms]	Relative
`pdu --quantity=apparent-size`	105.3 ± 4.6	1.00
`dust --no-progress --apparent-size`	191.1 ± 4.1	1.81×
`dua --count-hard-links --apparent-size`	355.9 ± 41.2	3.38×

Block size

Command	Mean [ms]	Relative
`pdu --quantity=block-size`	104.3 ± 2.5	1.00
`dust --no-progress`	202.6 ± 6.4	1.94×
`dua --count-hard-links`	311.6 ± 13.1	2.99×

Deduplicate hardlinks

Command	Mean [ms]	Relative
`pdu --deduplicate-hardlinks`	104.9 ± 2.4	1.00
`dust --no-progress`	202.2 ± 5.2	1.93×
`dua`	307.1 ± 16.2	2.93×

pdu (master) is ~1.9× faster than dust 1.2.4 and ~3× faster than dua 2.34.0 across all categories.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

https://api.github.com/repos/KSXGitHub/parallel-disk-usage/actions/workflows/competing-benchmark.yaml/dispatches
- Triggering command: /usr/bin/curl curl -s -X POST -H Accept: application/vnd.github+json -H Authorization: ****** -d {"ref":"copilot/benchmark-pdu-dust-dua"} -w \nHTTP Status: %{http_code}\n (http block)
https://api.github.com/repos/bootandy/dust/releases/latest
- Triggering command: /usr/bin/curl curl -sf REDACTED (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Co-authored-by: KSXGitHub <11488886+KSXGitHub@users.noreply.github.com>

KSXGitHub · 2026-03-07T12:15:43Z

@copilot Fascinating. dust recently has some commits that mentioned "performance" IIRC, but pdu is still 2x times faster. Can you explain why?

Copilot · 2026-03-07T12:17:09Z

@copilot Fascinating. dust recently has some commits that mentioned "performance" IIRC, but pdu is still 2x times faster. Can you explain why?

Great question. The ~2× advantage comes from a few architectural decisions in pdu:

1. Fully recursive Rayon parallelism at every depth

The key is in src/tree_builder.rs:

let children = children
    .into_par_iter()          // ← Rayon parallel iterator
    .map(|name| TreeBuilder { ... })
    .map(Self::from);         // ← recursive, so every subdirectory spawns more parallel work

Every directory's children are dispatched as parallel Rayon tasks, and because Self::from is recursive, the whole tree walk is a work-stealing parallel computation from root to leaves. Idle CPU cores continuously steal pending directory traversals from busy ones, with no bottleneck at any single level.

2. Lock-free concurrent hardlink recording

src/hardlink/hardlink_list.rs uses DashMap (a sharded, lock-free concurrent hashmap) so that parallel threads can record hardlinks without contending on a global mutex.

3. Early nlink short-circuit

In hardlink/aware.rs, files with nlink <= 1 are skipped immediately without any hashmap lookup — the common case for most files in a source tree.

4. symlink_metadata (no symlink following)

pdu always calls symlink_metadata instead of metadata, so it never follows symlinks. This avoids extra stat syscalls and the risk of cycles, reducing I/O in directories that contain many symlinks (common in kernel source trees, e.g., include/ headers).

Dust's recent performance commits likely improved its Rayon usage or reduced allocations, which is why it went from being 3–4× slower in older benchmarks to the ~1.9× slower we see now. But pdu's recursive work-stealing architecture and lock-free DashMap give it a structural edge that incremental optimisations in dust haven't closed yet.

KSXGitHub · 2026-03-07T12:23:35Z

Curiosity satisfied.

Initial plan

89b213d

Copilot AI assigned Copilot and KSXGitHub Mar 7, 2026

Copilot started work on behalf of KSXGitHub March 7, 2026 12:00 View session

Copilot AI and others added 2 commits March 7, 2026 12:04

Add competing benchmark workflow for pdu vs dust vs dua

8a7bf86

Co-authored-by: KSXGitHub <11488886+KSXGitHub@users.noreply.github.com>

Add explicit permissions to competing-benchmark workflow

3e58cbe

Co-authored-by: KSXGitHub <11488886+KSXGitHub@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Update benchmark job to compare specific tools~~ Add competing benchmark workflow: pdu (master) vs dust vs dua Mar 7, 2026

Copilot finished work on behalf of KSXGitHub March 7, 2026 12:11

Copilot started work on behalf of KSXGitHub March 7, 2026 12:15 View session

Copilot finished work on behalf of KSXGitHub March 7, 2026 12:17

KSXGitHub closed this Mar 7, 2026

KSXGitHub deleted the copilot/benchmark-pdu-dust-dua branch March 7, 2026 12:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add competing benchmark workflow: pdu (master) vs dust vs dua#340

Add competing benchmark workflow: pdu (master) vs dust vs dua#340
Copilot wants to merge 3 commits intomasterfrom
copilot/benchmark-pdu-dust-dua

Copilot AI commented Mar 7, 2026 •

edited

Loading

Uh oh!

KSXGitHub commented Mar 7, 2026

Uh oh!

Copilot AI commented Mar 7, 2026

Uh oh!

KSXGitHub commented Mar 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Copilot AI commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes

Benchmark Results (Linux 5.12 source tree, ~1.2 GB)

Apparent size

Block size

Deduplicate hardlinks

I tried to connect to the following addresses, but was blocked by firewall rules:

Uh oh!

KSXGitHub commented Mar 7, 2026

Uh oh!

Copilot AI commented Mar 7, 2026

Uh oh!

KSXGitHub commented Mar 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Copilot AI commented Mar 7, 2026 •

edited

Loading