Skip to content

Use list_objects_v2 API for S3#302

Merged
pjbull merged 21 commits intomasterfrom
s3-v2-api
Dec 18, 2022
Merged

Use list_objects_v2 API for S3#302
pjbull merged 21 commits intomasterfrom
s3-v2-api

Conversation

@pjbull
Copy link
Copy Markdown
Member

@pjbull pjbull commented Dec 18, 2022

This is primarily a fix for #155, but it should simplify and stabilize listing dirs from S3 in a way that makes a number of the other directory handling tasks easier (like #295, #181, #176, and #51).

One of the concerns is that this change may affect performance since we change both the recursive and non-recursive apis, so this PR includes an initial perf test suite for listing directories. See results below (TLDR: negligible change if any).

Bonus fixes:

  • Fix Smoke tests for non-all installations #171 (approved code would have introduced an error here, so worth it to fix)
  • Perf test suite
  • test-debug command stops debugger at failing tests from previous run
  • Check existence of buckets/containers when permitted by permissions
  • Add pytest-xdist to requirements, which will run the test suite in parallel and results in decent speed up. Comparison of latest master run and this branch shaves off ~10 minutes (makes sense since gh action machines have 2 cores):

image

Closes #151
Closes #171
Closes #291

Perf from same machine

Before changes

                                 Performance suite results: (2022-12-17T16:11:32.734517)
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Test Name    ┃ Config Name                ┃ Iterations ┃           Mean ┃              Std ┃            Max ┃ N Items ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ List Folders │ List shallow recursive     │         10 │ 0:00:01.425719 │ ± 0:00:00.172902 │ 0:00:01.788437 │   5,500 │
│ List Folders │ List shallow non-recursive │         10 │ 0:00:01.630402 │ ± 0:00:00.187300 │ 0:00:01.882856 │   5,500 │
│ List Folders │ List normal recursive      │         10 │ 0:00:02.168442 │ ± 0:00:00.039756 │ 0:00:02.219833 │   7,877 │
│ List Folders │ List normal non-recursive  │         10 │ 0:00:00.039067 │ ± 0:00:00.010252 │ 0:00:00.067796 │     113 │
│ List Folders │ List deep recursive        │         10 │ 0:00:03.237524 │ ± 0:00:00.060835 │ 0:00:03.317934 │   7,955 │
│ List Folders │ List deep non-recursive    │         10 │ 0:00:00.030883 │ ± 0:00:00.001670 │ 0:00:00.033317 │      31 │
└──────────────┴────────────────────────────┴────────────┴────────────────┴──────────────────┴────────────────┴─────────┘

With new changes

                                 Performance suite results: (2022-12-17T17:11:50.937390)
┏━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Test Name    ┃ Config Name                ┃ Iterations ┃           Mean ┃              Std ┃            Max ┃ N Items ┃
┡━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ List Folders │ List shallow recursive     │         10 │ 0:00:01.293559 │ ± 0:00:00.027168 │ 0:00:01.336273 │   5,500 │
│ List Folders │ List shallow non-recursive │         10 │ 0:00:01.289645 │ ± 0:00:00.037398 │ 0:00:01.348425 │   5,500 │
│ List Folders │ List normal recursive      │         10 │ 0:00:01.874748 │ ± 0:00:00.079183 │ 0:00:02.006050 │   7,877 │
│ List Folders │ List normal non-recursive  │         10 │ 0:00:00.033232 │ ± 0:00:00.004852 │ 0:00:00.045547 │     113 │
│ List Folders │ List deep recursive        │         10 │ 0:00:03.193576 │ ± 0:00:00.180428 │ 0:00:03.455747 │   7,955 │
│ List Folders │ List deep non-recursive    │         10 │ 0:00:00.030362 │ ± 0:00:00.001615 │ 0:00:00.032398 │      31 │
└──────────────┴────────────────────────────┴────────────┴────────────────┴──────────────────┴────────────────┴─────────┘

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Dec 18, 2022

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Dec 18, 2022

Codecov Report

Merging #302 (5a950bf) into master (289baeb) will decrease coverage by 0.5%.
The diff coverage is 91.4%.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #302     +/-   ##
========================================
- Coverage    94.9%   94.4%   -0.6%     
========================================
  Files          21      21             
  Lines        1341    1363     +22     
========================================
+ Hits         1273    1287     +14     
- Misses         68      76      +8     
Impacted Files Coverage Δ
cloudpathlib/s3/s3client.py 91.3% <88.0%> (-4.3%) ⬇️
cloudpathlib/azure/azblobclient.py 94.4% <100.0%> (+0.1%) ⬆️
cloudpathlib/client.py 87.9% <100.0%> (ø)
cloudpathlib/gs/gsclient.py 93.2% <100.0%> (-1.4%) ⬇️

Copy link
Copy Markdown
Member

@jayqi jayqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. One comment typo, and a few minor comments.

Comment thread Makefile Outdated
Comment thread tests/mock_clients/mock_azureblob.py Outdated
Comment thread tests/performance/cli.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

exists should work on a bucket Smoke tests for non-all installations ENH: Allow .resolve() with CloudPaths

3 participants