geotiff: reject malformed SCALE/OFFSET under mask_and_scale #2992
Merged
brendancol
commented
Jun 6, 2026
Contributor
Author
PR Review: geotiff: reject malformed SCALE/OFFSET under mask_and_scale
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None.
Nits (optional improvements)
xrspatial/geotiff/_errors.py: the hierarchy tree in the module docstring (Exception -> GeoTIFFAmbiguousMetadataError -> ...) already drops several existing subclasses, so leaving MalformedScaleOffsetError out of it stays consistent with what's there. The tree is already stale, so adding the new class is optional cleanup, not a fix.
What looks good
- The change is confined to the one helper the finding named, which keeps the conflict surface small against the sibling PR editing the same function.
- The absent-key vs present-but-malformed split is handled right. The loop only inspects a key when it's actually in the dict and returns the identity default only when no key is present. A malformed dataset-level SCALE raises instead of falling through to the per-band key, which matches the old code: on a parse failure it returned the default, never the per-band value.
- The new error subclasses GeoTIFFAmbiguousMetadataError/ValueError, so existing except ValueError callers keep catching it. Same shape as the InvalidIntegerNodataError precedent for this class of silent-coercion bug.
- The message names the offending key and value.
- Both call sites hit _extract_scale_offset at graph-assembly time, so the dask path fails closed before compute rather than lazily. The dask test confirms it.
- Tests cover eager SCALE, eager OFFSET, dask, and the no-opt-in case, with uniquely named temp files carrying the issue number.
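The absent-key vs present-but-malformed split described above can be sketched roughly as follows. This is an illustration, not the code in the PR: the exception classes are local stand-ins that only mirror the names and the ValueError ancestry mentioned in the review, and the flat `tags` dict, the `_1` per-band suffix, and the message wording are all assumptions made for the example.

```python
class GeoTIFFAmbiguousMetadataError(ValueError):
    """Stand-in for the library's base class (assumed to derive from ValueError)."""


class MalformedScaleOffsetError(GeoTIFFAmbiguousMetadataError):
    """Raised when a SCALE or OFFSET item is present but not a valid float."""


def extract_scale_offset(tags, band_key_suffix="_1"):
    """Return (scale, offset); raise on a present-but-malformed value.

    `tags` is a hypothetical flat dict of metadata items. The dataset-level
    key is checked first, then a per-band key; both names are illustrative.
    """
    result = {"SCALE": 1.0, "OFFSET": 0.0}  # identity defaults
    for name in result:
        for key in (name, name + band_key_suffix):
            if key not in tags:
                continue  # absent: try the next candidate key
            raw = tags[key]
            try:
                result[name] = float(raw)
            except (TypeError, ValueError):
                # Present but unparseable: fail loudly, naming key and value.
                raise MalformedScaleOffsetError(
                    f"{key}={raw!r} is not a valid float"
                ) from None
            break  # first present key wins; no fall-through after a failure
    return result["SCALE"], result["OFFSET"]
```

With this shape, `extract_scale_offset({})` yields the identity pair, while `{"SCALE": "abc", "SCALE_1": "2"}` raises rather than quietly using the per-band value, and a caller written as `except ValueError:` still catches it.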
Checklist
- Algorithm matches reference: n/a (validation guard, no numerical algorithm)
- Backends consistent: eager and dask both raise; GPU mask_and_scale is already rejected upstream
- NaN handling: untouched by this PR
- Edge cases tested: absent key, present-malformed SCALE, present-malformed OFFSET, no opt-in
- Dask chunk boundaries: n/a, the raise is at graph build
- No premature materialization: confirmed, no new array ops
- Benchmark: not needed (error path)
- README feature matrix: n/a (no new function, no backend change)
- Docstrings present and accurate: yes, on the new error class and the helper
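The "raise is at graph build" point in the checklist can be illustrated without dask. In the sketch below a plain zero-argument closure stands in for a lazy task graph. `parse_scale` and `build_lazy_read` are hypothetical names invented for this example and do not correspond to functions in the PR.

```python
def parse_scale(raw):
    """Parse a SCALE item; a hypothetical stand-in for the real helper."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        raise ValueError(f"SCALE={raw!r} is not a valid float") from None


def build_lazy_read(raw_scale, pixels):
    """Validate metadata now and defer only the pixel work.

    The returned function plays the role of a task graph: nothing touches
    `pixels` until it is called.
    """
    scale = parse_scale(raw_scale)  # raises here, while the "graph" is built

    def compute():
        return [p * scale for p in pixels]

    return compute


task = build_lazy_read("0.5", [2, 4])  # succeeds; no pixel work done yet
scaled = task()                        # the deferred work runs only here
```

Because the parse happens outside `compute`, calling `build_lazy_read("abc", [2, 4])` fails immediately instead of surfacing later when the deferred work runs, which is the fail-closed behaviour the review describes for the dask path.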
brendancol
commented
Jun 6, 2026
Follow-up review (after nit fix)
The one nit from the first pass is resolved: MalformedScaleOffsetError is now listed in the _errors.py hierarchy docstring tree. I left the other pre-existing subclasses missing from that tree alone, since backfilling them is unrelated to this PR.
No new findings. No blockers, no open suggestions. The change stays confined to the malformed-value handling in _extract_scale_offset plus the new error class and its exports.
Closes #2987
_extract_scale_offset returned the 1.0 / 0.0 identity default whenever a SCALE or OFFSET value failed to parse as a float, so a malformed item read the same as a missing one. Under mask_and_scale=True the caller asked the reader to honour the metadata, and the silent fallback let the raw unscaled pixels through as if the file were clean.
Changes:
- _extract_scale_offset now raises the new MalformedScaleOffsetError (a GeoTIFFAmbiguousMetadataError/ValueError subclass) when a SCALE or OFFSET key is present but unparseable. An absent key still returns the identity default, so a source with no scale/offset reads as before.
- Both mask_and_scale call sites (eager _finalize_eager_read and the dask backend) reach _extract_scale_offset during graph assembly, so the rejection fires before compute on both. Reads without mask_and_scale never touch the metadata and are unaffected.
Test plan:
- test_mask_and_scale_malformed_scale_raises / _offset_raises (eager)
- test_mask_and_scale_malformed_scale_dask_raises (dask)
- test_malformed_scale_ignored_without_mask_and_scale (no opt-in, no rejection)
- test_rioxarray_compat_2961.py suite still green (29 passed)
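The opt-in behaviour the test plan exercises can be shown with a toy reader. The sketch below is not the project's test suite: `read_pixels` and `raises_value_error` are hypothetical stand-ins operating on plain lists and a flat tag dict. Only the three behaviours checked (reject when opted in, ignore when not, identity when the key is absent) come from the PR description.

```python
def read_pixels(raw, tags, mask_and_scale=False):
    """Hypothetical reader that applies SCALE/OFFSET only when opted in."""
    if not mask_and_scale:
        return list(raw)  # metadata is never inspected on this path
    scale, offset = 1.0, 0.0  # identity defaults for absent keys
    try:
        if "SCALE" in tags:
            scale = float(tags["SCALE"])
        if "OFFSET" in tags:
            offset = float(tags["OFFSET"])
    except (TypeError, ValueError) as exc:
        raise ValueError(f"malformed scale/offset metadata: {exc}") from None
    return [v * scale + offset for v in raw]


def raises_value_error(fn, *args, **kwargs):
    """Return True if the call raises ValueError, False otherwise."""
    try:
        fn(*args, **kwargs)
    except ValueError:
        return True
    return False


# Opt-in with a malformed value: rejected.
assert raises_value_error(read_pixels, [1, 2], {"SCALE": "abc"}, mask_and_scale=True)
assert raises_value_error(read_pixels, [1, 2], {"OFFSET": "x"}, mask_and_scale=True)
# No opt-in: the malformed value is never looked at.
assert read_pixels([1, 2], {"SCALE": "abc"}) == [1, 2]
# Opt-in with no keys present: identity scaling, as before the change.
assert read_pixels([1, 2], {}, mask_and_scale=True) == [1.0, 2.0]
```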