
Commit 02a3556

Technologicat and claude committed
README: add Performance section with the lapackdrivers timing plot
The lapackdrivers benchmark example was already producing a timing figure (`figure1_latest.pdf`) but it lived in nobody's mental model: no README reference, file name suggesting "draft," cwd-relative output path so it landed wherever the user happened to invoke the script.

This commit gives the figure a proper home:

- examples/lapackdrivers_example.py now saves to project root via a `__file__`-anchored path (regardless of where the example was invoked from), and produces both `lapack_timings.png` (150 dpi, bbox tight, README artifact) and `lapack_timings.pdf` (vector quality, for printing or reuse). The historical `figure1_latest.pdf` name is gone.
- lapack_timings.png is committed at the project root as the canonical README image. The .pdf sibling is gitignored — it is just a regenerable build product. The legacy `figure1_latest.pdf` name is also gitignored, in case stale copies linger in working trees.
- README.md gains a "## Performance" section between Features and Examples, embedding the PNG with a two-paragraph caption that explains what the lines mean (parallel batched LAPACK drivers vs. a Python loop over numpy.linalg.solve, log–log scale), what the takeaway is (most savings come from staying inside nogil Cython for the loop and from OpenMP across independent problems), and how to regenerate the figure on the reader's own machine. Also a short cross-reference from the Speed bullet under Features pointing at the new section.

The figure is reproducible across runs and machines now that the example seeds the legacy global RNG with `np.random.seed(42)` (committed in the previous patch).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent aa2b906 commit 02a3556

4 files changed

Lines changed: 38 additions & 1 deletion


.gitignore

Lines changed: 9 additions & 0 deletions
@@ -22,3 +22,12 @@ pdm.lock
 wlsqm/fitter/*.html
 wlsqm/utils/*.html
 .claude/
+
+# Regenerable artifact from examples/lapackdrivers_example.py.
+# The PNG sibling IS committed (it's referenced from README.md), but the
+# PDF is just for vector-quality printing and would just churn whenever
+# someone reruns the example.
+lapack_timings.pdf
+# Old name produced by versions of the example before v1.0; kept for
+# back-compat in case anyone has stale copies in their working tree.
+figure1_latest.pdf

README.md

Lines changed: 18 additions & 0 deletions
@@ -72,6 +72,7 @@ Full derivation of the generalized version (including the case of unknown functi
 - **Speed.**
   - Performance-critical code is in Cython with the GIL released.
   - LAPACK is called directly via [SciPy's Cython-level bindings](https://docs.scipy.org/doc/scipy/reference/linalg.cython_lapack.html); no GIL round-trip for the solver loop.
+  - For asymptotic timings of the parallel batched LAPACK drivers vs. a Python loop over `numpy.linalg.solve`, see the figure under "Performance" below.
 - **Accuracy.**
   - Problem matrices are preconditioned by a symmetry-preserving iterative scaling (Ruiz, 2001) before LU factorization, which is critical for high-order fits.
     - Reference: Daniel Ruiz. 2001. *A Scaling Algorithm to Equilibrate Both Rows and Columns Norms in Matrices*. Report RAL-TR-2001-034.
@@ -81,6 +82,23 @@ Full derivation of the generalized version (including the case of unknown functi
   - Optional iterative refinement inside the solver mitigates roundoff further.


+## Performance
+
+![Average time per problem instance, parallel LAPACK drivers vs. a Python loop over numpy.linalg.solve, log–log scale](lapack_timings.png)
+
+Average time per problem instance for the parallel LAPACK drivers in `wlsqm.utils.lapackdrivers`, on a synthetic batch of independent symmetric and general linear systems of varying matrix size `n`. Generated by [`examples/lapackdrivers_example.py`](examples/lapackdrivers_example.py) (seeded RNG, deterministic). Both axes are log scale.
+
+The takeaway is that the batched parallel drivers (red and green lines) cost a small fraction of what a Python loop calling `numpy.linalg.solve` (black line) costs per instance — most of the savings come from staying inside `nogil` Cython for the loop and from OpenMP parallelism over independent problems. The factorize-once-then-solve pair (green) is essentially as fast as the single-shot solver (red) when the batch is solved exactly once, and pays off when the same factorization is reused against many right-hand sides.
+
+To regenerate the figure on your own machine:
+
+```bash
+python examples/lapackdrivers_example.py
+```
+
+The script writes both `lapack_timings.png` and `lapack_timings.pdf` to the project root.
+
+
 ## Examples

 Minimal example using a manufactured solution to produce the input data: fit `f(x,y) = 1 + 2x + 3y + 4xy + 5x² + 6y²` on a scattered point cloud centered at the origin.
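
Note on the new Performance caption: the black-line baseline (a Python loop over `numpy.linalg.solve`) can be approximated without building wlsqm. The sketch below is an illustration under stated assumptions, not the benchmark in `examples/lapackdrivers_example.py`. The helper names (`make_batch`, `solve_python_loop`, `solve_stacked`, `time_per_instance`) are hypothetical. NumPy's stacked `solve` is used only as a stand-in for "keep the loop out of Python"; it is not the OpenMP-parallel wlsqm driver, so its numbers are not comparable to the red and green lines.

```python
import time

import numpy as np


def make_batch(n_problems, n, seed=42):
    """Build a reproducible batch of independent n-by-n linear systems."""
    np.random.seed(seed)  # legacy global RNG, as described in the commit message
    # Adding n to the diagonal makes each matrix strictly diagonally dominant,
    # hence nonsingular (an assumption of this sketch, not of the real example).
    A = np.random.rand(n_problems, n, n) + n * np.eye(n)
    b = np.random.rand(n_problems, n, 1)
    return A, b


def solve_python_loop(A, b):
    """Baseline: one numpy.linalg.solve call per problem, looped in Python."""
    x = np.empty_like(b)
    for k in range(A.shape[0]):
        x[k] = np.linalg.solve(A[k], b[k])
    return x


def solve_stacked(A, b):
    """Stand-in for a batched driver: one call over the whole stack."""
    return np.linalg.solve(A, b)


def time_per_instance(solver, A, b, repeats=3):
    """Best wall-clock time over `repeats` runs, divided by the batch size."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        solver(A, b)
        best = min(best, time.perf_counter() - t0)
    return best / A.shape[0]


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        A, b = make_batch(2000, n)
        t_loop = time_per_instance(solve_python_loop, A, b)
        t_stack = time_per_instance(solve_stacked, A, b)
        print(f"n={n:3d}  loop: {t_loop:.3e} s  stacked: {t_stack:.3e} s")
```

The right-hand sides are stored with shape `(n_problems, n, 1)` so that the stacked call treats each one as a column vector on both old and new NumPy versions. Timings depend on the machine and the BLAS/LAPACK build, so no expected output is given.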

examples/lapackdrivers_example.py

Lines changed: 11 additions & 1 deletion
@@ -6,13 +6,20 @@
 """


+import os
 import time

 import numpy as np
 from numpy.linalg import solve as numpy_solve # for comparison purposes

 import matplotlib.pyplot as plt

+# Where to write the timing-plot output files. Anchor to the project root
+# (one level up from this script) so the PNG and PDF land in a stable,
+# version-controllable location regardless of where the example was
+# invoked from. README.md embeds `lapack_timings.png` from here.
+PROJECT_ROOT = os.path.abspath(os.path.join(os.path.dirname(os.path.abspath(__file__)), os.pardir))
+
 try:
     import wlsqm.utils.lapackdrivers as drivers
 except ImportError:
@@ -337,7 +344,10 @@ def main():
     plt.grid(visible=True, which='both')
     plt.legend(loc='best')

-    plt.savefig('figure1_latest.pdf')
+    # Save both formats: PNG for embedding in README.md, PDF for printing
+    # / vector-quality reuse. Both go to the project root.
+    plt.savefig(os.path.join(PROJECT_ROOT, 'lapack_timings.png'), dpi=150, bbox_inches='tight')
+    plt.savefig(os.path.join(PROJECT_ROOT, 'lapack_timings.pdf'), bbox_inches='tight')


 if __name__ == '__main__':
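
Note on the `PROJECT_ROOT` line in this diff: the nested `os.path` calls take the script's absolute path, step to its directory, and then go one level up. The same anchoring could be written with `pathlib`, which some readers find easier to scan. This is a sketch of an alternative, not what the commit does; `project_root` and `timing_plot_paths` are hypothetical helper names, and the stem `lapack_timings` is taken from the diff.

```python
import os
from pathlib import Path


def project_root(script_file):
    """Return the directory one level above the directory containing script_file.

    os.path.abspath is applied first so the result does not depend on how
    the script path was spelled (relative or absolute). Symlinks are not
    resolved, matching the behavior of the os.path version in the diff.
    """
    return Path(os.path.abspath(script_file)).parent.parent


def timing_plot_paths(script_file, stem="lapack_timings"):
    """Return (png_path, pdf_path) for the timing figure under the project root."""
    root = project_root(script_file)
    return root / f"{stem}.png", root / f"{stem}.pdf"


if __name__ == "__main__":
    png_path, pdf_path = timing_plot_paths(__file__)
    print(png_path)
    print(pdf_path)
```

Inside the example script, the two paths would be passed to `plt.savefig` with the same `dpi=150` and `bbox_inches='tight'` arguments that the diff uses; `savefig` accepts path-like objects as well as strings.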

lapack_timings.png

110 KB
