-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LVAE dataset: add MultiFileDset #223
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #223 +/- ##
==========================================
- Coverage 87.20% 86.97% -0.24%
==========================================
Files 131 132 +1
Lines 4956 4882 -74
==========================================
- Hits 4322 4246 -76
- Misses 634 636 +2 ☔ View full report in Codecov by Sentry. |
ashesh-0
requested changes
Aug 26, 2024
4 tasks
CatEek
reviewed
Aug 28, 2024
ashesh-0
approved these changes
Sep 4, 2024
4 tasks
jdeschamps
added a commit
that referenced
this pull request
Sep 9, 2024
#238) ### Description - **What**: Add option in `XYFilp` and `XYRandomRotate90` to transform additional arrays identically to the patch and target. Include the ability to compose these transforms. Also fixed some type hint issues, but some remain. - **Why**: The LVAE training requires the identical transformation of multiple arrays, including images and noise arrays. Previously it was implemented with the `Albumentations` library. Original motivation came from this comment chain #223 (comment). - **How**: A few options were considered, but ultimately I went with what I thought would require the least amount of the code base to change. Additional arrays can passed to the transform `__call__` function as kwargs, and they will be returned as a dict after the augmented patch and target. Composition of transforms with additional arrays has been kept as a separate method, `Compose.transform_with_additional_arrays`, mostly to not have to change too much of the code base, this can be revised and integrated with `Compose.__call__` later if desired. ### Changes Made - **Added**: - Tests for the new functionality. - `_chain_transforms_additional_arrays` and `transform_with_additional_arrays` methods to `Compose`. - **Modified**: - add `**additional_arrays: NDArray` argument to the `__call__` method of some transforms. - A few existing tests that needed to modify the unpacking of transform outputs, now that they output a tuple of 3 rather than 2. ### Breaking changes Code calling transforms directly, the output is now a tuple of 3 rather than 2. ### Additional Notes and Examples #### Note One of the biggest pain points that hasn't been fully resolved is that the `N2VManipulate` transform doesn't have the same function signature in the `__call__` method as the other transforms, it returns 3 arrays rather than 2. We know it gets applied last in the algorithm but we could make it more explicit somehow, this would resolve some annoying type hinting. #### Converting LVAE transform code https://github.com/CAREamics/careamics/blob/b008187292f269455272c92c9009d45a330e7ca0/src/careamics/lvae_training/dataset/vae_dataset.py#L867-L881 To convert the code snippet above, I think (not tested) the following can be done, assuming `self.rotation_transform` is created using the CAREamics `Compose` instead of that from Albumentations: ```python img_kwargs = {} for i, img in enumerate(img_tuples): for k in range(len(img)): img_kwargs[f"img{i}_{k}"] = img[k] noise_kwargs = {} for i, nimg in enumerate(noise_tuples): for k in range(len(nimg)): noise_kwargs[f"noise{i}_{k}"] = nimg[k] img, _, rot_dic = self._rotation_transform.transform_with_additional_arrays( img_tuples[0][0], **img_kwargs, **noise_kwargs ) ``` --- **Please ensure your PR meets the following requirements:** - [x] Code builds and passes tests locally, including doctests - [x] New tests have been added (for bug fixes/features) - [x] Pre-commit passes - [ ] PR to the documentation exists (for bug fixes / features) --------- Co-authored-by: Joran Deschamps <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
57cad67
from the main branch), adding theMultiFileDset
class,TilingMode
andEmptyPatchFetcher
MultiFileDset
is necessary for the datasets in the paperChanges Made
MultiFileDset
class and related classesTilingMode
and updated related code inMultiChDloader
EmptyPatchFetcher
MultiChDloader
data_utils.py
MultiFileDset
object initializationlvae_training/dataset/configs
GridIndexManager
andIndexSwitcher
to separate files inlvae_training/dataset/utils
data_utils.py
and testsPlease ensure your PR meets the following requirements: