Cache Dask arrays created from `NetCDFDataProxy`s to speed up loading files with multiple variables
#6252
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #6252 +/- ##
==========================================
+ Coverage 89.85% 89.88% +0.02%
==========================================
Files 88 88
Lines 23401 23430 +29
Branches 4357 4361 +4
==========================================
+ Hits 21028 21059 +31
+ Misses 1646 1644 -2
Partials 727 727
⏱️ Performance Benchmark Report: 953e8f9
The benchmarks showing changes aren't really the ones I'd expect.
I added a benchmark in bfbd625 that should show the improvement.
💯 It will be great to see this come together!
⏱️ Performance Benchmark Report: ad1e4f1
Thanks @bouweandela, looks really good to me for the most part!
Only one suggestion, and I'm happy for you to oppose that. Other than that happy for this to be merged.
Happy with this, thank you!
🚀 Pull Request
Description
Another idea to speed up loading NetCDF files with many variables. This caches the last 100 Dask arrays created from `NetCDFDataProxy`s so shared coordinates can be re-used. Since copying a Dask array is much faster than creating a new one, this gives a speedup.
Consult Iris pull request check list