Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work-around for breaking change in CellSens 4.1 #4117

Merged
merged 4 commits into from
Apr 5, 2024

Conversation

ed-scanlon
Copy link
Contributor

Missing metadata from .vsi file is causing exception when parsing .ets files. Added logic to dynamically associate .efs files with .vsi file metadata.

I have submitted a detailed write-up in Issue #4116.

….vsi file was causing exception when parsing .ets file. Added logic to dynamically associate .efs files with .vsi file metadata.
@melissalinkert
Copy link
Member

Thank you for the detailed write-up and proposed fix, @ed-scanlon. I don't see anything immediately concerning in code review, so am including in nightly builds to assess impact on our existing data.

I do have a private sample dataset (not included in automated tests) that exhibited the problem in #4116. This PR does allow that dataset to be read, but note that while this PR attempts to read the extra frame*.ets files, Olympus/Evident's OlyVIA software appears to ignore these files.

It would be good to understand what the extra frame*.ets files represent - are these legitimate acquisitions that were incorrectly omitted from the .vsi file? Or are they invalid/duplicate/deleted acquisitions that were correctly removed from the .vsi and the .ets file was just not deleted?

@ed-scanlon
Copy link
Contributor Author

Hi Melissa, thank you for reviewing my write-up, and for including the proposed work-around in the nightly builds. I have just a few more comments to add.

Yes, from what I can tell too, Olyvia does seem to ignore the extra files - if I remove all but one of the .ets files, Olyvia will still display everything the same way as when they're all present (the trick is knowing which ones to remove and which one to keep).

My proposed work-around is based on the very limited information I have, with the main unknown being the very question you wondered too, why are these extra frame_*.ets files present? Since I didn't know, it was hard for me to make the decision to just totally ignore them.

As for determining which .ets file is the correct one to use, picking the one with the best matching width and height may work well, but I have no way to know how well that method would hold up over many samples. A more ideal situation would be if we could find something within each EXTERNAL_FILE_PROPERTIES metadata block that would explicitly indicate exactly which .ets file it is referencing. (Unfortunately I have no more information about the format and contents of the .vsi file than I have ascertained by reading the bioformats code.) If anyone could find such information then it may be possible to simply construct the actual path to each .ets file from the associated metadata block. In that case it would no longer be necessary to perform directory scan for finding .ets files, and we'd have an actual "fix" instead of just a "work-around" for this problem.

…o associated metadata block in the .vsi file, its contents are omitted from the series list, and the file is treated the same way as the blob_*.meta files, which is to say, their names are included in the 'extraFiles' list, and they are otherwise ignored.
@ed-scanlon
Copy link
Contributor Author

Please see comment regarding the second commit here. #4116

@melissalinkert melissalinkert added this to the 7.2.0 milestone Dec 8, 2023
Copy link
Member

@melissalinkert melissalinkert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely agree that ignoring orphaned .ets files (but keeping them on the used file list) is the better approach; that's more consistent with how the vendor's software treats this data.

One of the two test datasets I have works as expected with this PR, but the other does not. The non-working data should have one .ets file that is removed by the block in line 788, but that doesn't appear to happen. The initialization is successful, but viewing the last several series (e.g. showinf -noflat -series 24) shows an exception similar to the original issue:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index 24 out of bounds for length 24
	at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
	at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
	at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
	at java.base/java.util.Objects.checkIndex(Objects.java:372)
	at java.base/java.util.ArrayList.get(ArrayList.java:459)
	at loci.formats.in.CellSensReader.getCurrentPyramid(CellSensReader.java:1070)
	at loci.formats.in.CellSensReader.openBytes(CellSensReader.java:545)
	at loci.formats.FormatReader.openBytes(FormatReader.java:922)
	at loci.formats.ImageReader.openBytes(ImageReader.java:449)
	at loci.formats.ReaderWrapper.openBytes(ReaderWrapper.java:334)
	at loci.formats.gui.BufferedImageReader.openImage(BufferedImageReader.java:86)
	at loci.formats.tools.ImageInfo.readPixels(ImageInfo.java:829)
	at loci.formats.tools.ImageInfo.testRead(ImageInfo.java:1063)
	at loci.formats.tools.ImageInfo.main(ImageInfo.java:1129)

The working dataset has series with distinct XY sizes; no two have the same X or Y dimension. The non-working dataset has several small series, some with identical XY sizes, and some that vary by less than 10 pixels in either dimension. I suspect the orphaned .ets file has the same XY size as one of the valid files, and so is incorrectly determined to have a matching metadata block.

I unfortunately can't share either dataset, but can certainly re-test with additional changes or capture the output of showinf -debug if additional debug logging is added.

@melissalinkert melissalinkert modified the milestones: 7.2.0, 8.0.0 Jan 29, 2024
@melissalinkert
Copy link
Member

@ed-scanlon : thanks again for all of your work here. In order to get this pull request merged, the comments in #4117 (review) will need to be addressed, and we'll need a signed Contributor License Agreement.

Could you please let us know if you're still working on this? If so, we'll leave this pull request open, but if not we'll plan to close it and leave #4116 open to be fixed at a later date.

@ed-scanlon
Copy link
Contributor Author

@melissalinkert : Thank you very much for message. Yes, this change is important to us and I would like to see it through. I have put it into my schedule to address the comments and post a new commit within the next couple of days. Thank you for your detailed analysis of the problem, I understand, and I have what I believe is a suitable resolution in mind.

… given 'pyramid' metadata struct is not associated with more than one ETS file. This change is to handle the case where the image sizes of multiple ETS files are the same or nearly the same as a single 'pyramid' metadata struct, resulting in an erroneous associations that cause the program to crash.
@sbesson sbesson linked an issue Mar 2, 2024 that may be closed by this pull request
@melissalinkert melissalinkert self-requested a review March 5, 2024 00:09
Copy link
Member

@melissalinkert melissalinkert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only final change I think is required is suggested inline - the fileMap variable just needs to be updated when an invalid file is found. That only causes an issue for things that call openThumbBytes; the easiest way to see this is by running showinf -thumbs which should throw something like:

Exception in thread "main" java.lang.IllegalArgumentException: Invalid series: 64
	at loci.formats.FormatReader.setCoreIndex(FormatReader.java:1430)
	at loci.formats.in.CellSensReader.openThumbBytes(CellSensReader.java:514)
	at loci.formats.ImageReader.openThumbBytes(ImageReader.java:479)
	at loci.formats.ReaderWrapper.openThumbBytes(ReaderWrapper.java:360)
	at loci.formats.gui.BufferedImageReader.openThumbImage(BufferedImageReader.java:100)
	at loci.formats.tools.ImageInfo.readPixels(ImageInfo.java:828)
	at loci.formats.tools.ImageInfo.testRead(ImageInfo.java:1063)
	at loci.formats.tools.ImageInfo.main(ImageInfo.java:1129)

With the suggested change, the same command should no longer throw IllegalArgumentException (though it may fail with OutOfMemoryError, which is far outside the scope of this pull request).

Otherwise, this looks good and now behaves as I'd expect with the test data I used previously. I'm fine with this being merged once the CLA is in place, and the above change is made and included in at least one passing nightly build.

* it and change its status to being an "extra" file.
*/
if (!validFrameFile) {
core.remove(core.size()-1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
core.remove(core.size()-1);
fileMap.remove(core.size() - 1);
core.remove(core.size()-1);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch Melissa! I tried to be thorough but that one escaped me. I place the filemap.remove statement within the parseETSFile() function; it seemed more appropriate since the file was added to the filemap from within that function. Please let me know if you encounter any other issues. Thank you.

@ed-scanlon
Copy link
Contributor Author

Thank you Melissa, we appreciate your help and collaboration on this issue. -Ed

@melissalinkert melissalinkert modified the milestones: 8.0.0, 7.3.0 Mar 11, 2024
Copy link
Member

@dgault dgault left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes here look good and the logic used makes sense from my side. Tested using the new artificial sample file and with the PR included the file was able to be opened and displayed as expected without the previous exception.

Repo tests have continued to pass with this PR and the new sample file included. CLA has also been received so the PR looks good to merge.

@dgault dgault merged commit 79e97b3 into ome:develop Apr 5, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change in CellSens 4.1 causing exception in VSI format reader.
3 participants