Commit 787603e
feat: pass skip_sha256=True to hf_xet for bucket uploads (#3900)
* feat: pass skip_sha256=True to hf_xet for bucket uploads
Bucket uploads don't need SHA-256 in the shard metadata (the sha_index
GSI is only used for LFS pointer resolution, which doesn't apply to
buckets). Pass skip_sha256=True to hf_xet.upload_files() and
upload_bytes() in the bucket upload path to skip the SHA-256
computation, removing the main CPU bottleneck on non-SHA-NI instances.
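For context, the work that skip_sha256=True avoids is roughly one extra full pass over every byte of each uploaded file. The sketch below uses Python's standard hashlib to show that kind of streaming hash. It only illustrates the cost and is not hf_xet's actual implementation; `sha256_of_stream` is a hypothetical helper name.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_of_stream(stream: BinaryIO, chunk_size: int = 1024 * 1024) -> str:
    """Hash a binary stream chunk by chunk and return the hex digest.

    Hypothetical helper: every byte of the input passes through the hash,
    which is the per-file CPU cost that a skip flag would avoid.
    """
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


# Small in-memory example; a real upload would read from a file on disk.
example_digest = sha256_of_stream(io.BytesIO(b"bucket payload"))
```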
Depends on: huggingface/xet-core#679
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
* test: use real bucket upload instead of mocks for skip_sha256 test
Replace the two mock-based tests with a single integration test that:
- Creates a real Bucket on staging Hub
- Uploads files from both filepath and bytes in a single batch
- Wraps (not mocks) hf_xet.upload_files and hf_xet.upload_bytes to
verify skip_sha256=True is passed
- Verifies files are actually uploaded by listing the bucket tree
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
* test: skip skip_sha256 test when hf_xet doesn't support it yet
The test wraps the real hf_xet functions, so it fails when the
installed hf_xet predates the skip_sha256 parameter (xet-core#679).
Use inspect.signature to detect support and pytest.skip accordingly.
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
* test: handle built-in functions in skip_sha256 signature check
hf_xet.upload_files is a compiled built-in function, so
inspect.signature() raises ValueError. Catch it and skip the test
when the signature can't be introspected (older hf_xet).
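A minimal sketch of this kind of signature check, assuming a tri-state result is acceptable: `True`, `False`, or `None` when the callable cannot be introspected. The helper name `supports_kwarg` and the two sample functions are hypothetical and do not come from the commit.

```python
import inspect
from typing import Callable, Optional


def supports_kwarg(fn: Callable, name: str) -> Optional[bool]:
    """Return True/False if support is known, None if fn has no signature."""
    try:
        params = inspect.signature(fn).parameters
    except (ValueError, TypeError):
        # Some compiled or built-in callables expose no introspectable signature.
        return None
    param = params.get(name)
    if param is not None and param.kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs catch-all also accepts the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def new_style_upload(paths, skip_sha256=False):
    return paths


def old_style_upload(paths):
    return paths
```

In a pytest test, a result other than `True` could be turned into `pytest.skip(...)`, which matches the behavior described above for older hf_xet builds.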
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
* fix: gracefully fall back when hf_xet lacks skip_sha256 support
Use try/except TypeError around upload_files/upload_bytes calls with
skip_sha256=True, falling back to calls without it for older hf_xet
versions. TypeError for unknown kwargs on compiled functions is raised
before any I/O, so the fallback is safe.
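The fallback described above might look roughly like the following. This is a sketch with stand-in functions, not the code from the commit: `call_with_skip_sha256`, `new_upload`, and `old_upload` are hypothetical names.

```python
from typing import Any, Callable


def call_with_skip_sha256(upload_fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Try the call with skip_sha256=True, retrying without it on TypeError."""
    try:
        return upload_fn(*args, skip_sha256=True, **kwargs)
    except TypeError:
        # Assumed to mean "unknown keyword argument", raised during argument
        # binding and therefore before the function body does any work.
        return upload_fn(*args, **kwargs)


def new_upload(paths, skip_sha256=False):
    # Stand-in for a version that understands the flag.
    return ("new", list(paths), skip_sha256)


def old_upload(paths):
    # Stand-in for a version that predates the flag.
    return ("old", list(paths))
```

One caveat with this pattern: a `TypeError` raised from inside the function body would also trigger the retry, so it relies on the argument-binding behavior the commit message describes.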
Update test to check call_args_list[0] (the first attempt always
includes skip_sha256=True) instead of requiring the function to
accept it.
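To illustrate why checking `call_args_list[0]` works regardless of the installed version, the sketch below wraps a stand-in function that rejects the keyword. The spy records the failed first attempt as well as the retry. `old_upload_bytes` and `upload_with_fallback` are hypothetical names, not the library's API.

```python
from unittest.mock import MagicMock


def old_upload_bytes(data):
    # Stand-in for an older function that does not accept skip_sha256.
    return len(data)


def upload_with_fallback(fn, data):
    try:
        return fn(data, skip_sha256=True)
    except TypeError:
        return fn(data)


# wraps= forwards calls to the real function; the call is recorded
# before the wrapped function raises.
spy = MagicMock(wraps=old_upload_bytes)
result = upload_with_fallback(spy, b"abc")

first_call, second_call = spy.call_args_list
```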
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
* better like this
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
1 parent 72871b9 commit 787603e
2 files changed
Lines changed: 37 additions & 0 deletions
File 1 of 2: 2 additions, at new lines 12216 and 12233 (diff content not preserved)
File 2 of 2: 35 additions, at new lines 342-376 (diff content not preserved)