Imaris Reader: support for LZ4 compression and performance improvements #4249

Open: wants to merge 10 commits into base: develop


marcobitplane

Hello,

this pull request adds support for LZ4 compressed ims files and modifies ImarisHDFReader to avoid multiple reads of the same 3D chunks. See also #4217.

LZ4 support is added using NetCDF-Java's ucar.nc2.filter package, which provides a mechanism for user-supplied filters as described here.
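
For context, here is a hedged sketch of the chunk framing that such a filter has to undo. It assumes the layout written by the HDF5 LZ4 filter plugin (filter ID 32004): an 8-byte big-endian total uncompressed size, a 4-byte big-endian block size, then a sequence of blocks, each prefixed by a 4-byte big-endian stored size, where a block whose stored size equals its uncompressed size is kept verbatim. The class name `Hdf5Lz4Framing` and the `BlockDecompressor` hook are hypothetical and are not taken from this PR; decoding of the raw LZ4 blocks is left to whichever library is plugged in.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch only: walks the assumed HDF5 LZ4 (filter 32004) chunk framing.
class Hdf5Lz4Framing {
    /** Hypothetical hook for the raw LZ4 block codec (for example, one backed by aircompressor). */
    interface BlockDecompressor {
        byte[] decompress(byte[] compressed, int uncompressedLength) throws IOException;
    }

    static byte[] decode(byte[] chunk, BlockDecompressor lz4) throws IOException {
        ByteBuffer in = ByteBuffer.wrap(chunk).order(ByteOrder.BIG_ENDIAN);
        long totalSize = in.getLong();   // total uncompressed size of the chunk
        int blockSize = in.getInt();     // nominal uncompressed size of each block
        if (totalSize < 0 || totalSize > Integer.MAX_VALUE || blockSize <= 0) {
            throw new IOException("Unsupported LZ4 chunk header");
        }
        byte[] out = new byte[(int) totalSize];
        int written = 0;
        while (written < out.length) {
            // The last block may be shorter than the nominal block size.
            int expected = Math.min(blockSize, out.length - written);
            int storedSize = in.getInt();
            if (storedSize < 0 || storedSize > in.remaining()) {
                throw new IOException("Truncated LZ4 block");
            }
            byte[] stored = new byte[storedSize];
            in.get(stored);
            // Equal sizes mean the block was stored without compression.
            byte[] block = (storedSize == expected) ? stored : lz4.decompress(stored, expected);
            if (block.length != expected) {
                throw new IOException("Unexpected decompressed block length");
            }
            System.arraycopy(block, 0, out, written, expected);
            written += expected;
        }
        return out;
    }
}
```

Keeping the framing separate from the block codec means the same loop would work whether the blocks are decoded by lz4-java, aircompressor, or another LZ4 implementation.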

Regarding performance, ImarisHDFReader is modified to have a caching mechanism that reads a stack of planes (as many planes as the chunk z-size) from all channels into a buffer, which only needs to be updated after all data in it has been read. If the size of the buffer would exceed 1GB, the reader falls back to reading the requested plane only. The exact performance improvement will depend on the details of the dataset: in our testing, for 3D datasets the new reader can be over an order of magnitude faster than the existing implementation.
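
The caching idea can be illustrated with a minimal sketch. This is not the code in the PR: the class `PlaneStackCache`, the `StackSource` callback and the `sourceReads` counter are hypothetical names, and the sketch simplifies the refresh rule by reloading the buffer whenever the requested plane falls outside the cached stack.

```java
import java.util.Arrays;

// Sketch only: caches one stack of chunkZ planes for all channels.
class PlaneStackCache {
    /** Hypothetical data source; stands in for the underlying HDF5 read. */
    interface StackSource {
        /** Returns planes [zStart, zStart + zCount) of one channel, concatenated. */
        byte[] readStack(int channel, int zStart, int zCount);
    }

    static final long MAX_BUFFER_BYTES = 1L << 30; // the 1 GB limit from the description

    private final StackSource source;
    private final int sizeZ, sizeC, chunkZ, planeBytes;
    private byte[][] buffer;       // one cached stack per channel
    private int cachedZStart = -1; // first plane of the cached stack, -1 if empty
    int sourceReads = 0;           // number of calls to the source, for demonstration

    PlaneStackCache(StackSource source, int sizeZ, int sizeC, int chunkZ, int planeBytes) {
        this.source = source;
        this.sizeZ = sizeZ;
        this.sizeC = sizeC;
        this.chunkZ = chunkZ;
        this.planeBytes = planeBytes;
    }

    byte[] getPlane(int z, int c) {
        long needed = (long) chunkZ * planeBytes * sizeC;
        if (needed > MAX_BUFFER_BYTES) {
            // Buffer would be too large: fall back to reading the requested plane only.
            sourceReads++;
            return source.readStack(c, z, 1);
        }
        int zStart = (z / chunkZ) * chunkZ;
        if (zStart != cachedZStart) {
            // Refill the buffer with the whole stack, for every channel.
            int zCount = Math.min(chunkZ, sizeZ - zStart);
            buffer = new byte[sizeC][];
            for (int ch = 0; ch < sizeC; ch++) {
                buffer[ch] = source.readStack(ch, zStart, zCount);
                sourceReads++;
            }
            cachedZStart = zStart;
        }
        int offset = (z - zStart) * planeBytes;
        return Arrays.copyOfRange(buffer[c], offset, offset + planeBytes);
    }
}
```

With a chunk z-size of 4 and two channels, requesting planes 4 to 7 of both channels touches the source twice in this sketch instead of eight times, which is where the speed-up for 3D datasets would come from.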

Member

@melissalinkert melissalinkert left a comment

Thanks, @marcobitplane, this definitely looks like a reasonable approach.

Two higher-level questions before we proceed with more thorough review and testing:

  • lz4-java is added as a dependency, but the existing dependency on aircompressor via ome-codecs should already support LZ4. Is there a reason to use lz4-java specifically, or could aircompressor be used instead?
  • The new LZ4Filter uses the ucar.nc2.filter package; if at all possible, I'd prefer to see this in the existing Bio-Formats package structure (loci.formats.*) rather than ucar.nc2.filter. A brief read of https://docs.unidata.ucar.edu/netcdf-java/5.5/userguide/reading_zarr.html#implementing-a-filter suggests the package name of the custom filter itself doesn't matter, as long as it extends/implements the correct class/interface. Could you please either change the package name, or explain a bit why putting LZ4Filter in ucar.nc2.filter is necessary?

Finally, in order to merge this we would need a signed Contributor License Agreement. That's not urgent at this point, but it would be good to have sooner rather than later.

@marcobitplane
Author

Thank you @melissalinkert!

I modified the PR to use aircompressor instead of lz4-java and moved LZ4Filter to loci/formats/filter. Let me know if that is not ideal or if there is more to change there. I'll send the CLA today.

@melissalinkert
Member

Adding to tonight's build, so we should know in the morning if there are any test failures on existing data.

@marcobitplane: maybe I missed this, but I don't see a reference to lz4 test files here or in #4217. Do you have any Imaris HDF files with lz4 compression that we can use to test this?

@marcobitplane
Author

Thank you @melissalinkert, here are two lz4-compressed cropped versions of Imaris demo images (retina and CellDemoMembrane3D). Feel free to let me know if you would prefer to have more images or have them shared in a different way.

@sbesson
Member

sbesson commented Nov 14, 2024

@marcobitplane thanks for sharing some sample data. Our preferred way to collect such samples is to upload them to the Bio-Formats Zenodo community collection. For the OME team and the community, this has the benefit of unambiguously assigning a license for the distribution and reuse of these samples. In addition, it is possible to add further samples to the upload and create new versions of the dataset as necessary during the review process.

I had a quick go at testing these with the proposed changes and the Bio-Formats command-line utility. With the just-released 8.0.1 command-line tools, Bio-Formats fails with:

sbesson@Sebastiens-MacBook-Pro-3 bioformats % ~/Downloads/bftools/showinf -nopix ~/Downloads/demoImagesLz4/CellDemoMembrane3Dlz4.ims
Checking file format [Bitplane Imaris 5.5 (HDF)]
Initializing reader
ImarisHDFReader initializing /Users/sbesson/Downloads/demoImagesLz4/CellDemoMembrane3Dlz4.ims
Exception in thread "main" java.lang.RuntimeException: Unknown filter type=32004
	at ucar.nc2.iosp.hdf5.H5tiledLayoutBB$DataChunk.getByteBuffer(H5tiledLayoutBB.java:227)
	at ucar.nc2.iosp.LayoutBBTiled.hasNext(LayoutBBTiled.java:101)
	at ucar.nc2.iosp.hdf5.H5tiledLayoutBB.hasNext(H5tiledLayoutBB.java:125)
	at ucar.nc2.iosp.IospHelper.readData(IospHelper.java:332)
	at ucar.nc2.iosp.IospHelper.readDataFill(IospHelper.java:292)
	at ucar.nc2.iosp.hdf5.H5iosp.readData(H5iosp.java:161)
	at ucar.nc2.iosp.hdf5.H5iosp.readData(H5iosp.java:134)
	at ucar.nc2.NetcdfFile.readData(NetcdfFile.java:2122)
	at ucar.nc2.Variable.reallyRead(Variable.java:817)
	at ucar.nc2.Variable._read(Variable.java:768)
	at ucar.nc2.Variable.read(Variable.java:600)
	at ucar.nc2.Variable.read(Variable.java:546)
	at loci.formats.services.NetCDFServiceImpl.getArray(NetCDFServiceImpl.java:172)
	at loci.formats.in.ImarisHDFReader.getImageData(ImarisHDFReader.java:473)
	at loci.formats.in.ImarisHDFReader.initFile(ImarisHDFReader.java:308)
	at loci.formats.FormatReader.setId(FormatReader.java:1480)
	at loci.formats.ImageReader.setId(ImageReader.java:864)
	at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:692)
	at loci.formats.tools.ImageInfo.testRead(ImageInfo.java:1048)
	at loci.formats.tools.ImageInfo.main(ImageInfo.java:1159)

With a local build of Bio-Formats using the HEAD of this PR, I get:

sbesson@Sebastiens-MacBook-Pro-3 bioformats % ./tools/showinf -nopix  -nopix ~/Downloads/demoImagesLz4/CellDemoMembrane3Dlz4.ims 
Checking file format [Bitplane Imaris 5.5 (HDF)]
Initializing reader
ImarisHDFReader initializing /Users/sbesson/Downloads/demoImagesLz4/CellDemoMembrane3Dlz4.ims
Failure during the reader initialization

Adding the -debug flag gives:

sbesson@Sebastiens-MacBook-Pro-3 bioformats % ./tools/showinf -nopix  -nopix ~/Downloads/demoImagesLz4/CellDemoMembrane3Dlz4.ims -debug  
...
loci.formats.FormatException: loci.common.services.ServiceException: java.io.IOException: ucar.nc2.filter.UnknownFilterException: Unknown filter: no filter found with id 32004
	at loci.formats.in.ImarisHDFReader.getImageData(ImarisHDFReader.java:518)
	at loci.formats.in.ImarisHDFReader.initFile(ImarisHDFReader.java:332)
	at loci.formats.FormatReader.setId(FormatReader.java:1480)
	at loci.formats.ImageReader.setId(ImageReader.java:864)
	at loci.formats.ReaderWrapper.setId(ReaderWrapper.java:692)
	at loci.formats.tools.ImageInfo.testRead(ImageInfo.java:1048)
	at loci.formats.tools.ImageInfo.main(ImageInfo.java:1158)
Caused by: loci.common.services.ServiceException: java.io.IOException: ucar.nc2.filter.UnknownFilterException: Unknown filter: no filter found with id 32004
	at loci.formats.services.NetCDFServiceImpl.getArray(NetCDFServiceImpl.java:183)
	at loci.formats.in.ImarisHDFReader.getImageData(ImarisHDFReader.java:514)
	... 6 common frames omitted
Caused by: java.io.IOException: ucar.nc2.filter.UnknownFilterException: Unknown filter: no filter found with id 32004
	at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB.<init>(H5tiledLayoutBB.java:90)
	at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:226)
	at ucar.nc2.internal.iosp.hdf5.H5iospNew.readData(H5iospNew.java:204)
	at ucar.nc2.NetcdfFile.readData(NetcdfFile.java:2122)
	at ucar.nc2.Variable.reallyRead(Variable.java:817)
	at ucar.nc2.Variable._read(Variable.java:768)
	at ucar.nc2.Variable.read(Variable.java:600)
	at ucar.nc2.Variable.read(Variable.java:546)
	at loci.formats.services.NetCDFServiceImpl.getArray(NetCDFServiceImpl.java:175)
	... 7 common frames omitted
Caused by: ucar.nc2.filter.UnknownFilterException: Unknown filter: no filter found with id 32004
	at ucar.nc2.filter.Filters.getFilter(Filters.java:80)
	at ucar.nc2.internal.iosp.hdf5.H5tiledLayoutBB.<init>(H5tiledLayoutBB.java:88)
	... 15 common frames omitted

The other sample, retinalz4.ims, fails with the same stack trace, so there still seems to be an issue with the registration of the custom LZ4 filter.
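
To illustrate the kind of failure seen in these traces, here is a simplified model of an id-based filter lookup. It rests on an assumption that has not been verified against this PR: that NetCDF-Java discovers user-supplied filter providers through Java's `ServiceLoader`, which only finds implementations listed in a matching `META-INF/services` file on the classpath. The names `FilterLookupSketch` and `ProviderSketch` are hypothetical stand-ins and are not the NetCDF-Java types.

```java
import java.util.ServiceLoader;

// Sketch only: models a lookup that fails when no provider is registered for an id.
class FilterLookupSketch {
    /** Hypothetical stand-in for a filter provider interface. */
    interface ProviderSketch {
        int getId();
        String getName();
    }

    static final int HDF5_LZ4_FILTER_ID = 32004; // id reported in the stack traces

    /** Returns the first provider with a matching id, or fails like the trace does. */
    static ProviderSketch find(int id, Iterable<? extends ProviderSketch> providers) {
        for (ProviderSketch p : providers) {
            if (p.getId() == id) {
                return p;
            }
        }
        throw new IllegalStateException("Unknown filter: no filter found with id " + id);
    }

    /** Looks up providers via ServiceLoader; none are found without a services entry. */
    static ProviderSketch findOnClasspath(int id) {
        return find(id, ServiceLoader.load(ProviderSketch.class));
    }
}
```

In this model, a provider class that compiles and sits on the classpath is still invisible to `findOnClasspath` until it is listed in the services file for the provider interface, so checking that entry (and its fully qualified class name after the package move) would be one place to start.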
