Skip full-grid k0 memory term for dask kriging templates #2926

Merged
brendancol merged 1 commit into main from deep-sweep-performance-interpolate-kriging-2026-06-04 on Jun 4, 2026
@brendancol

Contributor

Closes #2923

What changed

  • _check_kriging_memory now takes an is_dask flag. When the template is dask-backed, it drops the prediction-matrix (k0) term from the estimate, since map_blocks builds k0 one chunk at a time and peak per-task memory scales with the chunk, not the full grid.
  • The point-based terms (variogram pair arrays and the kriging matrix) are materialized on the host regardless of backend, so they still apply on the dask path.
  • kriging() detects a dask-backed template via isinstance(template.data, da.Array) and passes the flag through.
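
The three terms described above can be sketched roughly as follows. This is a minimal illustration, not the code in `_kriging.py`: the function name, its parameters, and the exact size formulas (an n x n pair array, an (n+1) x (n+1) kriging matrix, and an (n+1) x grid-cells k0 matrix, all float64) are assumptions made for the example.

```python
def check_kriging_memory_sketch(n_points, grid_cells, budget_bytes,
                                is_dask=False, itemsize=8):
    """Hypothetical stand-in for the memory guard; sizes are assumed."""
    # Point-based terms: built on the host regardless of backend,
    # so they count on both the numpy and the dask path.
    pair_bytes = n_points * n_points * itemsize
    matrix_bytes = (n_points + 1) ** 2 * itemsize

    # Prediction matrix (k0): only counted against the full grid when
    # the template is not dask-backed. On the dask path it is built
    # one chunk at a time, so the full-grid term is dropped.
    k0_bytes = 0 if is_dask else (n_points + 1) * grid_cells * itemsize

    estimate = pair_bytes + matrix_bytes + k0_bytes
    if estimate > budget_bytes:
        raise MemoryError(
            f"kriging would need ~{estimate / 1e6:.1f} MB, "
            f"budget is {budget_bytes / 1e6:.1f} MB"
        )
    return estimate


# Usage: the same inputs pass with is_dask=True and fail without it.
cap = 64 * 1024 ** 2  # 64 MB cap, as in the end-to-end test
dask_estimate = check_kriging_memory_sketch(62, 4000 * 4000, cap, is_dask=True)
```

Under these assumptions the dask path is limited only by the number of input points, which is why a very large point set still trips the guard.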

Why

Before this, a large chunked dask template raised a spurious MemoryError for the exact case the dask backend exists to handle. A 4000x4000 template with 200x200 chunks was estimated to need ~8.1 GB, when each chunk needs only ~20 MB.
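
The gap between those two figures comes from the cell counts alone: a 4000x4000 grid has 400 times as many cells as one 200x200 chunk. The sketch below works through the arithmetic. The row count of 63 and the float64 item size are assumptions picked only because they reproduce the quoted numbers; the PR does not state the point count used.

```python
ITEMSIZE = 8   # bytes per float64 value (assumed dtype)
K0_ROWS = 63   # hypothetical k0 row count, chosen to match the quoted figures


def k0_bytes(rows, ny, nx, itemsize=ITEMSIZE):
    """Bytes for a k0 matrix with `rows` rows and ny * nx columns."""
    return rows * ny * nx * itemsize


full_grid = k0_bytes(K0_ROWS, 4000, 4000)   # what the old guard estimated
one_chunk = k0_bytes(K0_ROWS, 200, 200)     # what a single task allocates
ratio = full_grid // one_chunk              # independent of K0_ROWS

print(f"full grid: {full_grid / 1e9:.1f} GB")
print(f"one chunk: {one_chunk / 1e6:.1f} MB")
print(f"ratio:     {ratio}x")
```

Whatever the real row count is, the ratio between the two estimates stays at 400 for this grid and chunk size.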

Backends

numpy guard behaviour unchanged. dask+numpy and dask+cupy no longer hit the full-grid k0 estimate. cupy (non-dask) unchanged.

Test plan

  • test_dask_template_skips_grid_memory_guard: a 4000x4000 / 200x200 dask template runs under a 64 MB cap
  • test_check_helper_dask_skips_k0_term: the helper drops the k0 term when is_dask=True
  • test_check_helper_dask_still_guards_matrix: the point-based matrix guard still fires on the dask path
  • Existing numpy memory-guard tests still pass (k0 term unchanged for numpy)
  • Full test_interpolation.py green (44 passed, includes cupy and dask+cupy parity)

The kriging() memory guard estimated the prediction matrix (k0) from the
full grid size, even for dask-backed templates where map_blocks builds k0
one chunk at a time. A large chunked template raised a spurious MemoryError
for the case the dask backend exists to handle. Drop the k0 term when the
template is dask-backed; the point-based variogram-pair and matrix terms
still apply.

Also records the interpolate-kriging performance sweep state.
github-actions bot added the performance label (PR touches performance-sensitive code) on Jun 4, 2026

brendancol left a comment

PR Review: Skip full-grid k0 memory term for dask kriging templates

Blockers

None.

Suggestions

None.

Nits

  • _kriging.py _check_kriging_memory: with is_dask=True the k0 term is dropped unconditionally. A dask array with a single chunk covering the whole grid still allocates the full k0 in that one chunk, so the guard under-estimates that case. It is unusual to hold a full-grid dask template in one chunk, and the old full-grid estimate was wrong for the normal chunked case, so dropping the term is the right tradeoff here. If you want to tighten it later, scale k0 to the largest chunk via template.data.chunks.
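
One way the tightening suggested in this nit could look is sketched below. It is not part of this PR. The helper names and the k0 size formula are hypothetical; the only thing taken from dask is that an array's `.chunks` attribute is a tuple with one tuple of chunk lengths per dimension.

```python
from math import prod


def largest_chunk_cells(chunks):
    """Cell count of the largest chunk, given a dask-style `.chunks` tuple.

    `chunks` has one inner tuple of chunk lengths per dimension, for
    example ((200, 200), (300, 100)) for a 400x400 array.
    """
    return prod(max(dim_chunks) for dim_chunks in chunks)


def k0_bytes_for_largest_chunk(chunks, k0_rows, itemsize=8):
    """Hypothetical per-task k0 estimate: rows x largest-chunk cells."""
    return k0_rows * largest_chunk_cells(chunks) * itemsize


# A 4000x4000 grid in 200x200 chunks versus the same grid in one chunk.
chunked = ((200,) * 20, (200,) * 20)
single = ((4000,), (4000,))
print(largest_chunk_cells(chunked), largest_chunk_cells(single))
```

With this approach the single-chunk case falls back to the full-grid estimate, while the normal chunked case is bounded by one chunk.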

What looks good

  • The change is small: one keyword arg and a per-backend choice for a single term. The variogram-pair and matrix terms still apply on the dask path, which is correct because both are built on the host regardless of backend.
  • is_dask detection checks da is not None before the isinstance, so it is safe with dask uninstalled, and it covers dask+cupy since that backend also wraps a dask.array.Array.
  • Tests cover the helper (k0 skipped, matrix still guarded), an end-to-end 4000x4000 / 200x200 dask template under a 64 MB cap, and the unchanged numpy path.
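
The guarded detection mentioned above follows a common optional-dependency pattern, sketched here. The helper name `is_dask_backed` is hypothetical, and the import fallback is an assumption about how `da` ends up as `None`; the PR only states that `da is not None` is checked before the `isinstance` call.

```python
try:
    import dask.array as da
except ImportError:
    # dask is optional: fall back to None so the check below still works.
    da = None


def is_dask_backed(data):
    """Return True only if dask is installed and `data` is a dask array.

    The `da is not None` test must come first; without it,
    `da.Array` would raise AttributeError when dask is missing.
    """
    return da is not None and isinstance(data, da.Array)


# A plain Python list is never dask-backed, with or without dask installed.
print(is_dask_backed([1.0, 2.0, 3.0]))
```

Because a dask array wrapping cupy chunks is still a `dask.array.Array`, the same check covers the dask+cupy backend.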

Checklist

  • Algorithm matches reference: n/a (memory guard, not a numerical kernel)
  • All implemented backends consistent: yes, dask+numpy and dask+cupy both covered
  • NaN handling: unchanged
  • Edge cases covered by tests: yes
  • Dask chunk boundaries: n/a
  • No premature materialization: confirmed, no .values/.compute added
  • Benchmark: not needed (guard logic)
  • README matrix: n/a (no new function)
  • Docstrings: updated to document is_dask behavior

brendancol merged commit 0b5303b into main on Jun 4, 2026
8 of 9 checks passed
Development

Successfully merging this pull request may close these issues.

kriging() memory guard raises spurious MemoryError on dask-backed templates
