pdf2json v4.0.3 Release Notes
Bug Fixes
- Text reading order — Added spatial sort (lib/pdftextsorter.js) to getRawTextContent() so multi-column
and complex-layout PDFs return text in correct top-to-bottom, left-to-right order instead of internal PDF
object order. (#422)
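To illustrate the idea behind this fix, here is a minimal sketch of a top-to-bottom, left-to-right spatial sort. It is not the code in lib/pdftextsorter.js: the item shape ({ x, y, text }, with y growing down the page), the function name sortReadingOrder, and the lineTolerance parameter are all assumptions made for the example, and the sketch does not attempt column detection.

```javascript
// Hypothetical sketch: not the actual lib/pdftextsorter.js implementation.
// Assumes each item is { x, y, text } with y increasing down the page.
function sortReadingOrder(items, lineTolerance = 0.5) {
  // Sort by vertical position first so lines can be grouped in one pass.
  const byY = [...items].sort((a, b) => a.y - b.y);

  const lines = [];
  for (const item of byY) {
    const current = lines[lines.length - 1];
    // Items whose y is within the tolerance of the line's first item share a line.
    if (current && Math.abs(item.y - current[0].y) <= lineTolerance) {
      current.push(item);
    } else {
      lines.push([item]);
    }
  }

  // Within each line, order left to right, then flatten back to one list.
  return lines.flatMap((line) => line.sort((a, b) => a.x - b.x));
}

// Items arrive in arbitrary (object) order, as they might from a PDF.
const items = [
  { x: 10, y: 5.1, text: "world" },
  { x: 1, y: 9, text: "second line" },
  { x: 1, y: 5, text: "hello" },
];
const orderedText = sortReadingOrder(items).map((i) => i.text).join(" ");
```

The tolerance exists because text on the same visual line rarely shares an exact y coordinate; grouping before the horizontal sort keeps slightly offset fragments together.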
CLI Improvements (#423)
- New --json flag: Emits a structured JSON summary to stdout (version, output file paths, stats, errors,
elapsed time) for programmatic and scripted consumption.
- New --quiet flag: Suppresses all non-error output (timer, status messages).
- Granular exit codes: 0 success · 1 parse failure · 2 argument error · 3 I/O error (previously only 0 or
1).
- Fixed --singleton / -si flags: Parser instance is now correctly shared at the CLI level; previously
broken.
- Directory filter: Only skips dotfiles now; previously silently skipped files starting with -, _, or
whitespace.
- 7 internal bug fixes: Eliminated Promise constructor anti-pattern, replaced callback-style
fs.writeFile/fs.readdir with fs.promises, fixed addResultCount type mismatch, removed dead warningCount,
and resolved a TOCTOU race condition in validateParams.
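For scripted use, the exit codes listed above can be mapped to outcomes in a small wrapper. The sketch below is only an illustration: describeExitCode and runPdf2json are hypothetical helper names, it assumes a pdf2json executable is on PATH and accepts -f for the input file, and because the exact shape of the --json summary is not spelled out here, it parses stdout without relying on any field names.

```javascript
const { spawnSync } = require("node:child_process");

// Exit codes as listed in these release notes.
const EXIT_CODES = {
  0: "success",
  1: "parse failure",
  2: "argument error",
  3: "I/O error",
};

// Hypothetical helper: turn an exit code into a readable label.
function describeExitCode(code) {
  return EXIT_CODES[code] ?? `unknown exit code ${code}`;
}

// Hypothetical helper: run the CLI with --json and return the outcome.
// Assumes a pdf2json executable is on PATH and accepts -f <input file>.
function runPdf2json(inputPath) {
  const result = spawnSync("pdf2json", ["-f", inputPath, "--json"], {
    encoding: "utf8",
  });
  let summary = null;
  try {
    // Field names are not assumed here; the summary is returned as parsed.
    summary = JSON.parse(result.stdout);
  } catch {
    // stdout was empty or not valid JSON (for example, the CLI failed to start).
  }
  return { outcome: describeExitCode(result.status), summary };
}
```

A caller could branch on outcome to decide whether to retry (I/O error), fix the invocation (argument error), or skip the file (parse failure).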
Build & Configuration
- tsconfig.json: Removed dead decorator options; updated moduleResolution/module to node16.
- package.json: Fixed exports map with proper types entries for ESM and CJS TypeScript consumers; removed
unused tslib dependency; added test:coverage script.
- rollup.config.js: Enabled tree-shaking for CLI bundle; documented build order dependency.
- CI: Upgraded to actions/checkout@v4; added tsc --noEmit type-check step; bumped Node.js to 22.x.
Tests
- 3 new test suites, 22 new tests: CLI integration (_test_cli.cjs), Stream API (_test_stream.cjs), and
error paths (_test_errors.cjs); all previously had zero coverage.
- Total: 74 tests / 7 suites (up from 52 / 4).
- Fixed listener leak in multi-parse test; standardized on Jest expect() over Node assert.
- Renamed _test_getRawTextContent.cjs → _test_sortBidiTexts.cjs to reflect actual coverage.
- Regenerated 37 baseline JSON files to reflect current parser output (baselines were stale since v0.6.8).
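These notes do not describe how the listener leak in the multi-parse test was fixed. As a general illustration of that class of bug, the sketch below uses plain Node EventEmitter objects as stand-ins for a reused parser instance; the dataReady event name is hypothetical and this is not the pdf2json API. It contrasts on(), which accumulates a handler per pass, with once(), which detaches the handler after its first call.

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical stand-ins for a reused parser instance; not the pdf2json API.
const leaky = new EventEmitter();
const clean = new EventEmitter();

for (let i = 0; i < 3; i++) {
  // on() keeps the handler attached, so each pass adds another listener.
  leaky.on("dataReady", () => {});
  leaky.emit("dataReady");

  // once() detaches the handler after its first call.
  clean.once("dataReady", () => {});
  clean.emit("dataReady");
}

// Listener counts after three simulated parses on each emitter.
const leakedCount = leaky.listenerCount("dataReady");
const cleanCount = clean.listenerCount("dataReady");
```

Checking listenerCount at the end of a test is a cheap way to catch this kind of accumulation before Node emits its MaxListenersExceededWarning.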