[numba.md] Update np.random → Generator API by Chihiro2000GitHub · Pull Request #550 · QuantEcon/lecture-python-programming

Chihiro2000GitHub · 2026-05-11T22:34:54Z

Summary

This PR migrates legacy NumPy random API usage in numba.md as part of QuantEcon/meta#299.

Details

The Numba/JIT-related classification and changes in this PR follow the guidance at https://manual.quantecon.org/styleguide/code.html#numpy-random-number-generation. I would be grateful if you could refer to the Numba section of this page when reviewing.

Case A (update(), main text, @jit): Left unchanged. Although update() is @jit-decorated, it is called from a prange loop in compute_long_run_median_parallel, making it unsafe to pass a shared Generator. Flagged for reviewer judgment.

Case B (speed_ex1 solution, @jit / no parallel): rng = np.random.default_rng() placed before the @jit definition; signature changed to calculate_pi(rng, n=...); np.random.uniform → rng.uniform. Call sites updated. Note: passing a Generator into a @jit function may require Numba to use object mode. I checked that the updated code runs as expected on my side, but I would appreciate reviewer confirmation.

Case C (speed_ex2 solution, plain Python / later JIT-compiled via jit(compute_series)): Same pattern as Case B. Signature changed to compute_series(n, rng); np.random.uniform → rng.uniform. Both the pure Python and jit-compiled call sites updated.

Case D (numba_ex3 solution, @jit(parallel=True) + prange): Draws lifted outside the prange loop. rng, u_draws, and v_draws defined before the @jit(parallel=True) definition; signature changed to calculate_pi(u_draws, v_draws); loop length derived from len(u_draws). n = 1_000_000 kept to match the original example size.

Case E (numba_ex4 solution, @jit(parallel=True) + prange): Legacy np.random.randn() calls kept intentionally as a narrow memory-constrained exception. Pre-allocating (M, n) shock arrays with M = 10_000_000 and n = 20 would require approximately 3.2 GB, which is likely beyond what most readers' machines can comfortably accommodate. A short comment has been added inside the loop explaining this.

Hi @mmcky and @HumphreyYang, I'd be grateful if you could take a look when you have time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions · 2026-05-11T22:51:38Z

🚀 Deployed on https://6a0aa7ac7ef596d0eff8f0f4--epic-agnesi-957267.netlify.app

HumphreyYang

Many thanks @Chihiro2000GitHub! It looks good to me. Just one small comment!

HumphreyYang · 2026-05-13T06:03:33Z

+n = 1_000_000
+rng = np.random.default_rng()
+u_draws = rng.uniform(size=n)
+v_draws = rng.uniform(size=n)
+
 @jit(parallel=True)
-def calculate_pi(n=1_000_000):
+def calculate_pi(u_draws, v_draws):
+    n = len(u_draws)
    count = 0
    for i in prange(n):
-        u, v = np.random.uniform(0, 1), np.random.uniform(0, 1)
+        u, v = u_draws[i], v_draws[i]
        d = np.sqrt((u - 0.5)**2 + (v - 0.5)**2)
        if d < 0.5:
            count += 1


I think this avoids the thread-safety issue, but the timing comparison is no longer quite apples-to-apples because the random draws have been moved outside the timed function.

Personally, I think we can generate the draws in the first solution by moving

n = 1_000_000 rng = np.random.default_rng() u_draws = rng.uniform(size=n) v_draws = rng.uniform(size=n)

to line 457, and modifying the function in the code block starting at line 457 (solution to exercise 1) to take u_draws and v_draws, like the function here.

In this case, we would have a fair timing comparison.

Please let me know what you think!

Hi @mmcky, I think this is an example where we need to make a tough choice between keeping the legacy API for simplicity and using the new rng API with some trade-offs.

Thanks @HumphreyYang I agree with you. I have added some additional notes below regarding double checking nopython mode as well.

mmcky · 2026-05-14T09:55:09Z

Thanks @Chihiro2000GitHub for the careful classification and write-up. As @HumphreyYang says most of the PR looks good. A few things to resolve before merging:

1. Generator passed into `@jit` (Case B, `speed_ex1`)

I'd like to confirm that this stays in nopython mode before we merge. Numba's support for numpy.random.Generator objects is limited and version-dependent — "runs" isn't the same as "stays in nopython mode," and if it silently falls back, the lecture's claim immediately below the code cell breaks:

If we switch off JIT compilation by removing @jit, the code takes around 150 times as long on our machine.

Could you re-run with @jit(nopython=True) (which raises rather than falls back) and report the with-JIT vs. without-JIT timing? If the 150x figure no longer holds, we should change approach.

2. Apples-to-apples timing (Case D vs Case B) — agree with @HumphreyYang

Humphrey's point on numba_ex3 is right: lifting the draws out of the timed function in Case D but leaving them inside in Case B makes the parallel-vs-serial comparison unfair. I'd suggest applying the "lift draws outside" pattern to speed_ex1 as well.

Two benefits:

The timing comparison across the two exercises becomes consistent.
It sidesteps issue 1 entirely — the JITed function only sees plain arrays, no Generator object crossing the JIT boundary.

This also gives readers a clean, consistent rule: JIT-compiled hot loops consume pre-drawn arrays.

3. Same concern in Case C (`speed_ex2`)

compute_series_numba = jit(compute_series) JIT-compiles a function that calls rng.uniform(...) internally. Same nopython-mode question as issue 1. Since rng.uniform(0, 1, size=n) is called once to produce the array U, the cleanest fix is symmetric with issue 2: draw U outside and pass it in as an argument. That keeps the JITed body free of Generator objects.

4. Minor — comment placement in Case E

The comment inside the inner loop explaining why draws are kept inline would read more naturally as a one-liner above the for t in range(n): loop, or in the prose just before the code cell. Not blocking.

Summary of requested changes: confirm nopython mode for Cases B and C (or refactor them to take pre-drawn arrays, which I'd prefer), and adopt the same pre-drawn-array pattern for speed_ex1 so the timing comparison with numba_ex3 is fair. Cases A and E are good as-is.

Thanks again!

Also this is definitely the hardest one I think :-) Thanks for iterating on this.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Chihiro2000GitHub · 2026-05-18T05:39:06Z

Thanks for the detailed review, @mmcky and @HumphreyYang! Here is a summary of the changes made and a timing verification.

Changes to `numba.md`

Case B (speed_ex1) — Issues 1 & 2: Pre-draw u_draws and v_draws outside the @jit function and pass them as arguments, matching the pattern already used in numba_ex3. The rng.Generator object no longer crosses the JIT boundary, and the timing comparison with numba_ex3 is now apples-to-apples.
Case C (speed_ex2) — Issue 3: Draw U outside compute_series and pass it as an argument, keeping the JIT-compiled body free of Generator objects.
Case E (numba_ex4) — Issue 4: Moved the inline comment about pre-allocation trade-offs to the prose just before the code cell.

Timing verification (`@jit(nopython=True)`)

I chose the refactor approach, but also ran @jit(nopython=True) as requested in Issue 1 to confirm nopython mode and check the timing:

Click to expand

"""
Temporary test: verify refactored Cases B and C in numba.md.
- Case B: @jit(nopython=True) timing, old API vs new API
- Case C: @jit(nopython=True) timing, new API only
"""
import numpy as np
import quantecon as qe
from numba import jit

# ---------------------------------------------------------------------------
# Case B: speed_ex1  (pi approximation)
# ---------------------------------------------------------------------------
print("=== Case B: calculate_pi ===")

# --- Old API ---
@jit(nopython=True)
def calculate_pi_old_jit(n=1_000_000):
    count = 0
    for i in range(n):
        u, v = np.random.uniform(0, 1), np.random.uniform(0, 1)
        d = np.sqrt((u - 0.5)**2 + (v - 0.5)**2)
        if d < 0.5:
            count += 1
    area_estimate = count / n
    return area_estimate * 4

def calculate_pi_old_pure(n=1_000_000):
    count = 0
    for i in range(n):
        u, v = np.random.uniform(0, 1), np.random.uniform(0, 1)
        d = np.sqrt((u - 0.5)**2 + (v - 0.5)**2)
        if d < 0.5:
            count += 1
    area_estimate = count / n
    return area_estimate * 4

print("  [Old API] Warm-up:")
with qe.Timer():
    calculate_pi_old_jit()

print("  [Old API] Timed @jit(nopython=True):")
with qe.Timer() as t_old_jit:
    calculate_pi_old_jit()

print("  [Old API] Timed pure Python:")
with qe.Timer() as t_old_pure:
    calculate_pi_old_pure()

print(f"  Old API speedup: {t_old_pure.elapsed / t_old_jit.elapsed:.1f}x")

# --- New API ---
n = 1_000_000
rng = np.random.default_rng()
u_draws = rng.uniform(size=n)
v_draws = rng.uniform(size=n)

@jit(nopython=True)
def calculate_pi_new_jit(u_draws, v_draws):
    n = len(u_draws)
    count = 0
    for i in range(n):
        u, v = u_draws[i], v_draws[i]
        d = np.sqrt((u - 0.5)**2 + (v - 0.5)**2)
        if d < 0.5:
            count += 1
    area_estimate = count / n
    return area_estimate * 4

def calculate_pi_new_pure(u_draws, v_draws):
    n = len(u_draws)
    count = 0
    for i in range(n):
        u, v = u_draws[i], v_draws[i]
        d = np.sqrt((u - 0.5)**2 + (v - 0.5)**2)
        if d < 0.5:
            count += 1
    area_estimate = count / n
    return area_estimate * 4

print("  [New API] Warm-up:")
with qe.Timer():
    calculate_pi_new_jit(u_draws, v_draws)

print("  [New API] Timed @jit(nopython=True):")
with qe.Timer() as t_new_jit:
    calculate_pi_new_jit(u_draws, v_draws)

print("  [New API] Timed pure Python:")
with qe.Timer() as t_new_pure:
    calculate_pi_new_pure(u_draws, v_draws)

print(f"  New API speedup: {t_new_pure.elapsed / t_new_jit.elapsed:.1f}x")

# ---------------------------------------------------------------------------
# Case C: speed_ex2  (Markov chain simulation)
# ---------------------------------------------------------------------------
print("\n=== Case C: compute_series ===")

p, q = 0.1, 0.2
n = 1_000_000
rng = np.random.default_rng()
U = rng.uniform(0, 1, size=n)

def compute_series(n, U):
    x = np.empty(n, dtype=np.int64)
    x[0] = 1
    for t in range(1, n):
        current_x = x[t-1]
        if current_x == 0:
            x[t] = U[t] < p
        else:
            x[t] = U[t] > q
    return x

x = compute_series(n, U)
print(np.mean(x == 0))

with qe.Timer() as t_c_pure:
    compute_series(n, U)

print(f"  Pure Python time: {t_c_pure.elapsed:.4f}s")

compute_series_numba = jit(compute_series, nopython=True)

x = compute_series_numba(n, U)
print(np.mean(x == 0))

with qe.Timer() as t_c_jit:
    compute_series_numba(n, U)

print(f"  JIT time:         {t_c_jit.elapsed:.4f}s")

Results on my machine (speed_ex1 / Case B):

	Speedup
Old API (`np.random.uniform` inside loop), `@jit(nopython=True)` vs pure	~150x
New API (pre-drawn arrays), `@jit(nopython=True)` vs pure	~400–700x (variable across runs)

Both Cases B and C compiled successfully under @jit(nopython=True) — no object-mode fallback.

Question about "~150 times"

The old API gives a speedup consistent with the existing "around 150 times" claim. However, after the refactor (pre-drawn arrays), the speedup is considerably larger, so the current text no longer holds. Would you like me to update that line? Options I see:

Replace with "significantly faster"
Replace with "hundreds of times faster"
Something else you prefer

You can also reproduce these results by running the code above.

One more question: `numba_ex3` timing claim

The solution to numba_ex3 states: "On our workstation, we find that parallelization increases execution speed by a factor of 2 or 3." Since this PR also modified the numba_ex3 solution to use pre-drawn arrays, should we re-verify this figure under the new code as well?

Update rng usage in numba.md

397a891

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Chihiro2000GitHub requested review from HumphreyYang and mmcky May 11, 2026 22:34

github-actions Bot temporarily deployed to pull request May 11, 2026 22:51 Inactive

HumphreyYang reviewed May 13, 2026

View reviewed changes

Address reviewer feedback on rng usage in numba.md

aef82bd

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions Bot deployed to pull request May 18, 2026 05:46 View deployment

Chihiro2000GitHub mentioned this pull request May 18, 2026

Update NumPy random seed usage → recommended Generator approach across lecture repos QuantEcon/meta#299

Open

40 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[numba.md] Update np.random → Generator API#550

[numba.md] Update np.random → Generator API#550
Chihiro2000GitHub wants to merge 2 commits into
mainfrom
update-rng-numba

Chihiro2000GitHub commented May 11, 2026

Uh oh!

github-actions Bot commented May 11, 2026 •

edited

Loading

Uh oh!

HumphreyYang left a comment •

edited

Loading

Uh oh!

HumphreyYang May 13, 2026

Uh oh!

HumphreyYang May 13, 2026

Uh oh!

mmcky May 14, 2026

Uh oh!

mmcky commented May 14, 2026

Uh oh!

Chihiro2000GitHub commented May 18, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

Chihiro2000GitHub commented May 11, 2026

Summary

Details

Uh oh!

github-actions Bot commented May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

HumphreyYang left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

HumphreyYang May 13, 2026

Choose a reason for hiding this comment

Uh oh!

HumphreyYang May 13, 2026

Choose a reason for hiding this comment

Uh oh!

mmcky May 14, 2026

Choose a reason for hiding this comment

Uh oh!

mmcky commented May 14, 2026

1. Generator passed into @jit (Case B, speed_ex1)

2. Apples-to-apples timing (Case D vs Case B) — agree with @HumphreyYang

3. Same concern in Case C (speed_ex2)

4. Minor — comment placement in Case E

Uh oh!

Chihiro2000GitHub commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Changes to numba.md

Timing verification (@jit(nopython=True))

Question about "~150 times"

One more question: numba_ex3 timing claim

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions Bot commented May 11, 2026 •

edited

Loading

HumphreyYang left a comment •

edited

Loading

1. Generator passed into `@jit` (Case B, `speed_ex1`)

3. Same concern in Case C (`speed_ex2`)

Chihiro2000GitHub commented May 18, 2026 •

edited

Loading

Changes to `numba.md`

Timing verification (`@jit(nopython=True)`)

One more question: `numba_ex3` timing claim