Add group path to series index mapping in root attributes #157
Fixes glencoesoftware#126. The root level attributes (if they are written) will now contain a "groups" dictionary that maps each group with multiscales metadata to the corresponding Bio-Formats series index (== Image index in METADATA.ome.xml).
This is absolutely amazing! |
Thanks @melissalinkert for starting this discussion and coming up with a few proposals. Capturing a few immediate thoughts:
Based on the above, an alternative approach would be to extend the existing |
Our use cases aren't specific to HCS, so I'd prefer to have a more generic solution than just extending the I don't think it makes sense to add more metadata under this new attribute, so a simple dictionary should be enough. If we need more image or well-level metadata, then those specs should be updated accordingly. In terms of mismatches, I think what is in
...that likely means a corresponding update to raw2ometiff. If this were written up as part of the spec, it probably means adding a bunch of requirements to the effect of One thought for handling the JSON size issue is to move this out of root |
Given the current hard mapping between Bio-Formats series and the path to the group which contains the
and for non-plate data:
This does what it's supposed to do and no more. |
If extensibility is unlikely to be a requirement, given that the series are always indexed from 0 to n, a simple list as proposed in #157 (comment) would certainly do the job.
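To make the list alternative concrete, here is a minimal Python sketch of how a path-to-index dictionary collapses into a plain list once the series indexes are known to be contiguous and zero-based. The group paths and the helper name `groups_to_series_list` are hypothetical and only serve as illustration; they are not taken from this PR.

```python
# Hypothetical group paths for a tiny plate; illustrative only.
groups = {"A/1/0": 0, "A/1/1": 1, "A/2/0": 2, "A/2/1": 3}


def groups_to_series_list(groups):
    """Turn a {path: series index} dict into a list where position == series index."""
    series = [None] * len(groups)
    for path, index in groups.items():
        # Reject gaps, out-of-range values, and duplicate indexes.
        if not 0 <= index < len(series) or series[index] is not None:
            raise ValueError("series indexes must be unique and contiguous from 0")
        series[index] = path
    return series


series = groups_to_series_list(groups)
```

With the list form, looking up the group for series `i` is just `series[i]`, at the cost of not being able to attach extra per-series metadata later.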
Interesting. At least from my perspective, this would feel like a good compromise, allowing this metadata to be stored without inflating the top-level |
beb41fa and 508c9bb together should address #157 (comment) and move the new attribute to |
Using the latest commit, I successfully converted a public IDR plate with 384 wells and 30 fields of view, i.e. ~10K images. I uploaded the output to a temporary public bucket https://uk1s3.embassy.ebi.ac.uk/bf2raw_157/7361.zarr in case the Glencoe team already has tools with which to test the series mapping proposal. |
👍 thanks @sbesson! |
Very sorry for my long silence on this! 😷 My highest level comment is whether or not there's interest in trying to move this discussion to ome/ngff as a new spec. To me this feels very much like the first steps towards the non-transitional (permanent?) solution for ome/ngff#104
It might be worth a round of discussions on it, but it seems like the feeling is generally "everything can't be in one JSON file". One thing I might add though is that at the top level we might want to "register" this location. (See example below)
In my first quick attempts at this, I had also sided on the use of an array. I was almost sure I had posted this somewhere, but not being able to find it I'll repeat myself here: {
"@type": "GenericContainer",
"metadataSources": [
{
"@type": "OMEXMLMetadata",
"file": {
"path": "/OME/METADATA.ome.xml"
},
"seriesMapping": [
"s0", "s1", "s2"
]
}
]
} |
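As a rough illustration of how a reader might consume the array form sketched in the JSON above, the following Python snippet parses that same structure and resolves a series index to its group path. This is only a sketch of the proposal as written in this comment, not an adopted specification; the function name `series_path` is hypothetical.

```python
import json

# The JSON structure from the comment above, reproduced as a string.
container = json.loads("""
{
  "@type": "GenericContainer",
  "metadataSources": [
    {
      "@type": "OMEXMLMetadata",
      "file": {"path": "/OME/METADATA.ome.xml"},
      "seriesMapping": ["s0", "s1", "s2"]
    }
  ]
}
""")


def series_path(container, series_index):
    """Return the group path for a series index (hypothetical helper)."""
    for source in container.get("metadataSources", []):
        if source.get("@type") == "OMEXMLMetadata":
            # The array position is the series index.
            return source["seriesMapping"][series_index]
    raise KeyError("no OMEXMLMetadata source found")


path = series_path(container, 1)
```

Because the array position doubles as the series index, no separate integer needs to be stored per group.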
Documents the current state of glencoesoftware/bioformats2raw#157
https://github.com/melissalinkert/ngff/commits/group-list (on top of ome/ngff#112) is a start at describing what this PR does now. |
Used this current branch to build 2 samples which have been uploaded to uk1s3 for testing. (See ome/ome-ngff-validator#13 for URLs) Edit: current mainline version of ZarrReader opens both fine with:
cc: @dgault |
I am planning to give ome/ngff#112 another round of review today. My understanding is that the next step there will be to port this addition to the 0.4 specification document and update https://ngff.openmicroscopy.org/0.4/ to clarify the expectations and conventions of this layout for data consumers.
My vote is to get this change merged and move towards the release of bioformats2raw 0.5.0. Once the upstream OME-NGFF specification is updated, the README can also be amended to include a reference to the relevant section.
Fixes #126.
The root level attributes (if they are written) will now contain a "groups" dictionary that maps each group with multiscales metadata to the corresponding Bio-Formats series index (== Image index in METADATA.ome.xml).
As noted in the comments, the paths are dictionary keys and the integer indexes are the values. If the indexes were keys, they would be stored as strings, so this seems easier for a reader. Example tiny plate:
Happy to hear better ideas overall (especially for the attribute name); this is just a place to start.
/cc @kkoz @erindiel @DavidStirling @joshmoore @dgault @sbesson @chris-allan (and feel free to add anyone I missed)
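The point in the description about integer keys being stored as strings can be checked with a short Python sketch. JSON object keys are always strings, so a mapping keyed by series index does not survive a round trip unchanged, whereas a mapping keyed by path does. The group paths below are hypothetical.

```python
import json

# Hypothetical group paths; illustrative only.
by_path = {"A/1/0": 0, "A/1/1": 1}    # path -> series index (as in this PR)
by_index = {0: "A/1/0", 1: "A/1/1"}   # series index -> path (the alternative)

# Serialize to JSON and read back, as a Zarr attributes file would be.
roundtrip_by_path = json.loads(json.dumps(by_path))
roundtrip_by_index = json.loads(json.dumps(by_index))

# Keys of the index-keyed mapping come back as strings and need int() on read.
recovered = {int(k): v for k, v in roundtrip_by_index.items()}
```

With paths as keys, the integer values keep their type and a reader needs no conversion step.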