Refactor prediction pipeline #131

Merged
merged 16 commits into from
Jun 11, 2024


jdeschamps
Member

@jdeschamps jdeschamps commented Jun 4, 2024

Description

tldr:

When prediction is not tiled, the images are still forced through the tiling pipeline (with a TileInformation object passed along with the images), which makes debugging complex. This PR splits the prediction datasets into tiled and non-tiled ones.

I also changed the memory check that switches between the in-memory and iterable datasets during training from total memory to available memory, as the latter better represents what can actually be loaded in memory.
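To illustrate why available memory is the safer reference for this switch, here is a minimal sketch. The helper name `use_in_memory_dataset` and the `safety_fraction` margin are hypothetical and not taken from the code base; the PR only states that get_ram_size now looks at available memory.

```python
def use_in_memory_dataset(
    data_size_mb: float,
    reference_memory_mb: float,
    safety_fraction: float = 0.8,  # hypothetical margin, not from the PR
) -> bool:
    """Return True when the data is expected to fit in the reference memory."""
    if not 0 < safety_fraction <= 1:
        raise ValueError("safety_fraction must be in (0, 1]")
    return data_size_mb < reference_memory_mb * safety_fraction


# A machine with 32 GB installed but only 6 GB currently free.
total_mb, available_mb = 32_000, 6_000
data_size_mb = 10_000

# Comparing against the total memory picks the in-memory dataset...
print(use_in_memory_dataset(data_size_mb, total_mb))      # True
# ...while comparing against the free memory picks the iterable one.
print(use_in_memory_dataset(data_size_mb, available_mb))  # False
```

In practice the free memory could be read with `psutil.virtual_memory().available` (reported in bytes); whether the project uses exactly that call is an assumption here.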

Finally, I simplified the stitching and prediction pipeline by passing the TileInformation further down the pipeline.
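The idea of carrying tile metadata all the way to the stitching step can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `TileInfo` is a hypothetical stand-in for TileInformation, its field names are invented for the example, and nested lists replace real arrays so that the block has no dependencies.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start, stop) along one axis


@dataclass(frozen=True)
class TileInfo:
    """Hypothetical metadata travelling with each predicted tile."""

    array_shape: Tuple[int, int]      # shape (Y, X) of the full image
    overlap_crop: Tuple[Span, Span]   # region to keep, in tile coordinates
    stitch: Tuple[Span, Span]         # where that region goes in the image


def stitch_tiles(
    tiles: Sequence[Tuple[List[List[int]], TileInfo]]
) -> List[List[int]]:
    """Rebuild the full image from (tile, TileInfo) pairs."""
    height, width = tiles[0][1].array_shape
    image = [[0] * width for _ in range(height)]
    for tile, info in tiles:
        (cy0, cy1), (cx0, cx1) = info.overlap_crop
        (sy0, _), (sx0, sx1) = info.stitch
        for dy in range(cy1 - cy0):
            # Copy the cropped row of the tile to its place in the image.
            image[sy0 + dy][sx0:sx1] = tile[cy0 + dy][cx0:cx1]
    return image


# A 4x4 image split into two 4x3 tiles overlapping by two columns.
original = [[row * 4 + col for col in range(4)] for row in range(4)]
tile_a = [row[0:3] for row in original]
tile_b = [row[1:4] for row in original]
info_a = TileInfo((4, 4), ((0, 4), (0, 2)), ((0, 4), (0, 2)))
info_b = TileInfo((4, 4), ((0, 4), (1, 3)), ((0, 4), (2, 4)))

stitched = stitch_tiles([(tile_a, info_a), (tile_b, info_b)])
print(stitched == original)  # True
```

Because each tile carries its own crop and placement coordinates, the stitching function needs no knowledge of how the tiles were extracted.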

  • What: Refactor prediction datasets into tiled and non-tiled datasets, simplify stitching.
  • Why: Avoids forcing non-tiled prediction through the same complex pipeline as the tiled one.
  • How: Split the two features (tiled and non-tiled prediction) into two datasets.
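The split summarised above can be sketched with two independent iterators: one for whole samples that never touches tile metadata, and one that yields tiles together with their metadata. This is a simplified 1D illustration; `TileInfo`, `iter_samples` and `iter_tiles` are hypothetical names, and the tiling scheme (a shorter last tile) is an assumption made for brevity rather than the scheme used in the code base.

```python
from typing import Iterator, List, NamedTuple, Sequence, Tuple


class TileInfo(NamedTuple):
    """Hypothetical stand-in for the metadata attached to each tile."""

    sample_id: int
    start: int       # first index of the tile in the full sample
    stop: int        # one past the last index of the tile
    last_tile: bool  # True for the final tile of a sample


def iter_samples(samples: Sequence[Sequence[int]]) -> Iterator[List[int]]:
    """Non-tiled path: yield each sample as-is, with no tile metadata."""
    for sample in samples:
        yield list(sample)


def iter_tiles(
    samples: Sequence[Sequence[int]], tile_size: int, overlap: int
) -> Iterator[Tuple[List[int], TileInfo]]:
    """Tiled path: yield (tile, TileInfo) pairs for every sample."""
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must satisfy 0 <= overlap < tile_size")
    step = tile_size - overlap
    for sample_id, sample in enumerate(samples):
        start = 0
        while True:
            stop = min(start + tile_size, len(sample))
            last = stop == len(sample)
            yield list(sample[start:stop]), TileInfo(sample_id, start, stop, last)
            if last:
                break
            start += step


data = [list(range(10))]
print(len(list(iter_samples(data))))      # 1 whole sample
print(len(list(iter_tiles(data, 4, 2))))  # 4 tiles
```

Keeping the two paths separate means the non-tiled consumer only ever sees plain samples, which is the debugging simplification the PR aims for.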

Changes Made

  • Added:
    • dataset/iterable_pred_dataset.py
    • dataset/iterable_tiled_pred_dataset.py
    • dataset/in_memory_pred_dataset.py
    • dataset/in_memory_tiled_pred_dataset.py
  • Modified: get_ram_size now looks at available memory.
  • Removed: useless calls to sort in datasets, and is_tile in TileInformation.
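With four prediction datasets, the choice between them comes down to two switches: tiled or not, and in-memory or iterable. The sketch below maps those switches to the module paths listed above. The function `pick_prediction_dataset` is hypothetical, and selecting on exactly these two booleans is an assumption about how the calling code decides.

```python
def pick_prediction_dataset(tiled: bool, in_memory: bool) -> str:
    """Return the module path matching the two switches (illustrative only)."""
    modules = {
        (False, False): "dataset/iterable_pred_dataset.py",
        (True, False): "dataset/iterable_tiled_pred_dataset.py",
        (False, True): "dataset/in_memory_pred_dataset.py",
        (True, True): "dataset/in_memory_tiled_pred_dataset.py",
    }
    return modules[(tiled, in_memory)]


print(pick_prediction_dataset(tiled=True, in_memory=False))
# dataset/iterable_tiled_pred_dataset.py
```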

Related issues

This PR fixes #125.

Notes

This PR will create merge conflicts with #134.

Currently, there are two issues remaining:

  • extract_tile returns C(Z)YX, while the non-tiled datasets always return SC(Z)YX with a singleton dimension
  • it is not clear whether, and when, we should cast the Tensors into np.ndarray. I tried to make it happen in a single place (in the prediction loop), but that is not entirely solved.
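One possible way to reconcile the two output shapes from the first point is to normalise everything to SC(Z)YX before it reaches the prediction loop. The sketch below works on shape tuples only; `with_sample_axis` is a hypothetical helper, and it assumes that the singleton dimension mentioned above is the leading S axis. It is not what the PR implements.

```python
from typing import Sequence, Tuple


def with_sample_axis(shape: Sequence[int], is_3d: bool) -> Tuple[int, ...]:
    """Return an SC(Z)YX shape, adding a singleton S axis to C(Z)YX input."""
    n_spatial = 3 if is_3d else 2
    if len(shape) == n_spatial + 1:  # C(Z)YX: sample axis is missing
        return (1, *shape)
    if len(shape) == n_spatial + 2:  # SC(Z)YX: already complete
        return tuple(shape)
    raise ValueError(f"Unexpected shape {tuple(shape)} for is_3d={is_3d}")


print(with_sample_axis((3, 64, 64), is_3d=False))        # (1, 3, 64, 64)
print(with_sample_axis((1, 3, 16, 64, 64), is_3d=True))  # (1, 3, 16, 64, 64)
```

On a real array the same step would amount to indexing with `array[np.newaxis, ...]` when the sample axis is missing.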

Please ensure your PR meets the following requirements:

  • Code builds and passes tests locally, including doctests
  • New tests have been added (for bug fixes/features)
  • Pre-commit passes
  • PR to the documentation exists (for bug fixes / features)

@jdeschamps jdeschamps changed the title (WIP) Refactor datasets Refactor prediction datasets Jun 5, 2024
@jdeschamps jdeschamps marked this pull request as ready for review June 7, 2024 17:10
@jdeschamps jdeschamps requested review from CatEek and melisande-c June 7, 2024 17:10
@jdeschamps jdeschamps changed the title Refactor prediction datasets Refactor prediction pipeline Jun 7, 2024

codecov bot commented Jun 7, 2024

Codecov Report

Attention: Patch coverage is 95.92760% with 9 lines in your changes missing coverage. Please review.

Project coverage is 91.33%. Comparing base (4767329) to head (616fac0).
Report is 69 commits behind head on main.

Files                                                    Patch %   Lines
...eamics/dataset/dataset_utils/iterate_over_files.py    92.85%    2 Missing ⚠️
.../careamics/dataset/in_memory_tiled_pred_dataset.py    94.59%    2 Missing ⚠️
...c/careamics/dataset/iterable_tiled_pred_dataset.py    93.93%    2 Missing ⚠️
src/careamics/config/tile_information.py                 83.33%    1 Missing ⚠️
src/careamics/dataset/iterable_pred_dataset.py           96.15%    1 Missing ⚠️
src/careamics/lightning_prediction_datamodule.py         90.90%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #131      +/-   ##
==========================================
+ Coverage   81.90%   91.33%   +9.42%     
==========================================
  Files         103      104       +1     
  Lines        2841     2561     -280     
==========================================
+ Hits         2327     2339      +12     
+ Misses        514      222     -292     


@CatEek CatEek merged commit 9c829b7 into main Jun 11, 2024
20 checks passed
@CatEek CatEek deleted the jd/refac/refactor_datasets branch June 11, 2024 09:38
@melisande-c melisande-c restored the jd/refac/refactor_datasets branch June 13, 2024 13:56
@melisande-c melisande-c deleted the jd/refac/refactor_datasets branch June 13, 2024 13:58
Development

Successfully merging this pull request may close these issues.

Stitched prediction not compatible with multi channel [BUG]
2 participants