Generalize well organization in high-content screening: field of view => image #137
Thanks for that. "Field of views of the microscope may be saved as individual images in each |
Thanks @will-moore
Sounds great, I shortened it that way
Thanks for the confirmation. In that case, I guess it needs to remain being called |
Looks good, thx 👍
@will-moore Just checking in: What is the process or timeline to get this change into the OME-NGFF spec? Is there a chance it will be part of the 0.5 spec? Do I need to talk to some people or convince someone else first that this would be a good idea?
I would expect this to be included in the v0.5 spec, especially since it's more like advice than a change in the spec.
No objection from my side. We might want whoever will be driving the 0.5 roadmap to also quickly sign-off.
From a naming perspective:
- the usage of `images` increases the consistency with the terminology used in the well specification
- from the closest equivalent model, a WellSample in the OME model is defined as an image captured within a well
Regarding the discussion between alternative layouts and their suitability for different application contexts, I do not have a better suggestion than the note. Two comments:
1- this discussion applies outside the context of HCS data i.e. storing unstitched vs stitched images,
2- other decisions have similar trade-offs (chunking size, chunk dimensions, resolution granularity).
I anticipate the information about these trade-offs might be reworked as the specification evolves.
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/faim-hcs-functions-to-work-with-hcs-data/78868/11
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: |
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/best-approach-for-appending-to-ome-ngff-datasets/89070/3
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: https://forum.image.sc/t/fractal-framework-zarr-compatibility/92536/2
I would like to suggest a change to the wording of the OME-NGFF HCS plate specification and add some recommendations about the trade-off between visualization performance and the structure of image pyramids per well. Specifically, I propose that the OME-NGFF spec explicitly allow whole wells to be saved as a single image. As a consequence, the components of a well would be images, not fields of view (because an image could consist of multiple fields of view that are already stitched together).
Motivation
We would like to use OME-Zarr files to store TB-sized, multi-channel, 3D high-content imaging data in the HCS format. We are building Fractal, an open-source image processing pipeline for data in HCS OME-Zarr. One of the benefits of saving such large datasets as OME-Zarr is the possibility of interactive image visualization, e.g. in the napari viewer. When we tested how this approach scales to large HCS plates, we discovered issues with saving every field of view of the microscope as a separate image in each well of the OME-Zarr file.
We started the discussion about this topic here: ome/ome-zarr-py#200
The discussion on the approach of saving single images per well starts here in more detail: ome/ome-zarr-py#200 (comment)
To very briefly summarize it:
Saving many fields of view (FOVs) per well as separate images, each with its whole pyramid hierarchy, leads to very poor IO performance. To visualize plates at low resolution, a tiny pyramid file needs to be loaded for each field of view. When a plate has >1000 fields of view across all its wells, this becomes very slow. Even for a case with just 72 fields of view and just 3 pyramid levels, loading was already 8 times slower with the FOVs saved as separate image pyramids than with a single image pyramid. This seems to be a fundamental issue of how fast many small files vs. a single large file can be accessed, and it would likely get worse on object storage than on classical file systems. See further details in the issues above.
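To make the difference between the two layouts concrete, here is a minimal sketch. It assumes the well metadata shape from the current well specification (a `well` object with an `images` list of `path` entries). The helper names `well_metadata` and `arrays_for_overview`, the 96-well plate and the 9 FOVs per well are hypothetical values chosen for illustration, not measurements from our tests.

```python
import json


def well_metadata(image_paths, version="0.4"):
    """Build a well-level metadata dict with one entry per image.

    Assumption: mirrors the `well` -> `images` -> `path` layout of the
    well specification; optional keys such as `acquisition` are omitted.
    """
    return {
        "well": {
            "images": [{"path": path} for path in image_paths],
            "version": version,
        }
    }


def arrays_for_overview(n_wells, images_per_well):
    """Count the lowest-resolution arrays a viewer has to open to render
    the whole plate: one per image, because every image has its own pyramid."""
    return n_wells * images_per_well


# Hypothetical plate: 96 wells, 9 fields of view per well.
N_WELLS = 96
FOVS_PER_WELL = 9

# Layout A: every field of view is stored as its own image pyramid.
per_fov = well_metadata([str(i) for i in range(FOVS_PER_WELL)])

# Layout B: the fields of view are fused into a single image per well.
fused = well_metadata(["0"])

print(json.dumps(fused, indent=2))
print("arrays to open, one image per FOV: ", arrays_for_overview(N_WELLS, FOVS_PER_WELL))
print("arrays to open, one image per well:", arrays_for_overview(N_WELLS, 1))
```

The metadata structure is identical in both layouts. Only the number of entries in `images` changes, and with it the number of small arrays that have to be opened for a low-resolution plate overview.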
Thus, our solution has been to store each well as a single, fused image. In discussions on this issue, there was openness to this approach being part of the spec. I have therefore created this PR to suggest a change that explicitly allows this and mentions the trade-offs. I hope this PR can be the place to discuss this further and see whether it can make it into the OME-NGFF spec.
Open questions
How should we specify the trade-offs? I'm proposing a "Note" here, but open to other implementations. Also, is this specification of Note correct? Does it work for multi-line paragraphs?
Is the explanation of the trade-offs understandable? See here: 20261ac
I think it is important to get away from the "field of view" naming in the spec when wells can be collections of images. But there are two keys in the plate metadata that contain the name "field". How should one proceed with these? Specifically:
- maximumfieldcount: does it describe the maximum number of fields of view per well? Or in total? ⇒ Is the wording "images per well" correct? Or would it be images in the whole plate (though then what is “max”, isn’t that just a count)?
- field_count: is that per well or per plate? It says “fields per view” ⇒ what is a view?