Planning explicit dependency on Zarr v3 #392
Comments
I'd also be glad to contribute to this effort. I'd like to confirm the goals:
If I'm correct in both being goals, I think we'll also need to figure out what to do with the kerchunk writer since that's based on zarr_format=2. |
Thanks for raising this @sharkinsspatial, I was planning to raise an issue like this myself.
I think that your bullets roughly correspond to the PRs we need, but we should actually do the steps in the opposite order listed.
The dependency on kerchunk can be re-introduced at any point, once a v3-compatible release of kerchunk becomes available. However, I want to have this all done by the time Icechunk releases 1.0 (~late Feb), which means we can't really wait around for kerchunk. Once we have stability in the dependencies and in the external API we can work on refactors at our leisure:
That should put us on solid ground for future feature additions, with everything passing with released versions of all dependencies, and a clearer separation of responsibilities within the test suite. Any issues with the HDF reader can be tackled as necessary. It would be great to have another meta-issue to track those @sharkinsspatial. But I think that none of those fixes are actually prerequisites to this plan, as there are currently no tests that pass with kerchunk's HDF5 reader but fail with your HDF reader.
@maxrjones yes we want to only support
Zarr-python v3 supports reading from v2-compliant stores. But Icechunk only obeys the zarr v3 spec, so it's a non-issue for icechunk virtual stores. For kerchunk that's a good point - right now a v3-compliant kerchunk store I guess is undefined, as kerchunk itself does not support v3. So the kerchunk writer will have to raise if Whether or not we should support writing to |
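To make the point about the kerchunk writer concrete, here is a minimal sketch of a format guard. It assumes the writer would reject any request other than zarr_format=2, since a v3-compliant kerchunk store is currently undefined. The function name `check_kerchunk_zarr_format` and the constant are hypothetical and not part of VirtualiZarr or kerchunk.

```python
# Assumption: kerchunk references can only describe Zarr v2 metadata for now.
SUPPORTED_KERCHUNK_ZARR_FORMAT = 2


def check_kerchunk_zarr_format(zarr_format: int) -> int:
    """Hypothetical guard a kerchunk writer could call before serializing.

    Returns the format unchanged when it is supported, and raises otherwise.
    """
    if zarr_format != SUPPORTED_KERCHUNK_ZARR_FORMAT:
        raise NotImplementedError(
            f"kerchunk references with zarr_format={zarr_format} are undefined; "
            f"only zarr_format={SUPPORTED_KERCHUNK_ZARR_FORMAT} is supported"
        )
    return zarr_format
```

Raising early like this keeps the failure explicit instead of silently writing references that no reader can interpret. The guard could be relaxed once kerchunk itself defines a v3-compatible reference format.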
FYI I created a project board because I was having difficulties remembering the various dependency conflicts (especially with the alpha icechunk releases). The README will track the dependencies for Kerchunk, Icechunk, and VirtualiZarr, and the phases are broken down according to Tom's outline above. Here's the project board - https://github.com/orgs/zarr-developers/projects/6. I'd need an organization owner to make it public but hopefully everyone has access to the Zarr-Python org to see the board🤞 |
Thanks @maxrjones! @joshmoore I don't believe I have the necessary permissions to make this project public either - would you mind awfully either doing that (or better yet giving me such permissions)? |
👍 on figuring out the permissions. Pinged on zulip for a chat. |
Update: GitHub permissions require There appear to be a number of workarounds for pieces of this (like this app to add people to the org). We set the project permission together and double checked team membership. 👍 |
We have a few interdependent issues which discuss the next steps for completely migrating to Zarr v3 but I thought it may be helpful to outline a rough plan of attack which consolidates all of these issues in one spot so everyone can contribute to the discussion on how to achieve this. I'll list things sequentially, but several of these may need to be approached in a single PR due to dependency conflicts.
HDFVirtualBackend. Along with the previous steps, this would remove the bulk of our dependence on kerchunk and free us to depend on Zarr v3 explicitly with only a limited loss of existing functionality. There are several outstanding issues (and likely more) with the HDFVirtualBackend that were discovered and raised during the Pangeo hack day in December that will need to be addressed. I'll link to a separate issue with the plan for tackling these.
dataset->kerchunk reader->virtualizarr kerchunk reader approach until dedicated Virtualizarr readers can be developed for these formats.
@TomNicholas It would be great to get your feedback on how you think these steps should be organized into PRs so that we can make manageable changes but still execute the necessary parts of the test suite. @abarciauskas-bgse and I have some availability now to start tackling #17 and I'm going to begin working to stabilize HDFVirtualBackend so that it is hopefully robust enough for the majority of current use cases.