Channel-dependent normalization #134
Conversation
I merged with the most recent main, which introduced some errors. I have pushed some fixes, but not all tests pass yet.
Mypy is definitely not passing...
The main points of the review are the following:
- use of dataclasses
- improving the normalization with broadcasting (see the sketch below)
- type declaration for the stats in the Pydantic model
There is still one test not passing, related to the BMZ, potentially because it previously used a tensor in [-1, 1], which meant that the normalization did not matter much for the result. Now we again have small differences between the CAREamics prediction and the BMZ one. Let's understand where they come from.
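As a reference for the broadcasting point above, here is a minimal sketch of per-channel normalization applied via NumPy broadcasting. The layout SC(Z)YX and the function name are illustrative assumptions, not the actual CAREamics API:

```python
import numpy as np


def normalize_per_channel(
    patch: np.ndarray,   # assumed layout: (S, C, ...) with spatial dims last
    means: np.ndarray,   # per-channel means, shape (C,)
    stds: np.ndarray,    # per-channel stds, shape (C,)
) -> np.ndarray:
    """Normalize each channel with its own statistics via broadcasting."""
    # Reshape the stats to (1, C, 1, ..., 1) so they broadcast over the sample
    # and spatial dimensions without an explicit per-channel loop.
    stats_shape = (1, patch.shape[1]) + (1,) * (patch.ndim - 2)
    means = np.asarray(means).reshape(stats_shape)
    stds = np.asarray(stds).reshape(stats_shape)
    return (patch - means) / (stds + 1e-8)  # epsilon avoids division by zero
```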
### Description

> **tldr**:
> - Split prediction datasets into tiled and non-tiled
> - Simplify stitching by passing `TileInformation` all along
> - Fix #125
> - Change `total` memory check to `available`

When not tiling prediction, the images are forced through the tiling pipeline (with a `TileInformation` class being passed along the images), which makes debugging complex. This PR splits the prediction datasets into tiled and non-tiled. I also changed the `total` memory check into `available` for the switch between in-memory and iterable datasets during training, as it better represents what can be loaded in memory (see the memory-check sketch below). Finally, I simplified the stitching and prediction pipeline by passing the `TileInformation` further.

- **What**: Refactor prediction datasets into tiled and non-tiled datasets, simplify stitching.
- **Why**: Avoids forcing non-tiled prediction through the same complex pipeline as the tiled one.
- **How**: Split the two features (tiled and non-tiled predictions) into two datasets.

### Changes Made

- **Added**:
  - *dataset/iterable_pred_dataset.py*
  - *dataset/iterable_tiled_pred_dataset.py*
  - *dataset/in_memory_pred_dataset.py*
  - *dataset/in_memory_tiled_pred_dataset.py*
- **Modified**: `get_ram_size` now looks at available memory.
- **Removed**: Useless calls to `sort` in the datasets, and `is_tile` in `TilingInformation`.

### Related issues

This PR fixes #125.

### Notes

This PR will create merge conflicts with #134. Currently, there are two issues remaining:

- `extract_tile` returns C(Z)YX, while the non-tiled datasets always return SC(Z)YX with a singleton dimension.
- It is not clear when, and where, we should cast the `Tensors` into `np.ndarray`. I tried to make it happen in a single place (in the prediction loop), but that is not entirely solved.

---

**Please ensure your PR meets the following requirements:**

- [x] Code builds and passes tests locally, including doctests
- [x] New tests have been added (for bug fixes/features)
- [x] Pre-commit passes
- [x] PR to the documentation exists (for bug fixes / features)
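For the `total` vs `available` memory change, a minimal sketch of what such a check could look like with `psutil`; the helper name `get_ram_size` matches the description above, but the exact signature and unit are assumptions:

```python
import psutil


def get_ram_size() -> int:
    """Return the currently available system RAM in bytes.

    Using `available` rather than `total` better reflects how much data can
    actually be loaded in memory when deciding between the in-memory and
    iterable datasets.
    """
    return psutil.virtual_memory().available
```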
Description
Input patches and targets should be normalized separately, because they may have very different pixel value ranges. In addition, normalization was previously done across all channels at once, which is not correct when channels have very different intensity statistics and should not be mixed. A sketch of the intended behaviour follows below.
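A minimal sketch of that behaviour, computing statistics per channel and separately for inputs and targets. The SC(Z)YX layout, random data, and function name are illustrative assumptions:

```python
import numpy as np


def channel_stats(stack: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Compute per-channel mean and std over the sample and spatial axes of an SC(Z)YX stack."""
    axes = (0,) + tuple(range(2, stack.ndim))  # reduce over S and spatial dims, keep C
    return stack.mean(axis=axes), stack.std(axis=axes)


rng = np.random.default_rng(0)
inputs = rng.normal(100.0, 10.0, size=(8, 2, 64, 64))  # SCYX stack, two channels
targets = rng.normal(0.0, 1.0, size=(8, 2, 64, 64))    # very different value range

# Inputs and targets each get their own per-channel statistics, so neither the
# channels nor the input/target ranges are mixed into a single mean/std pair.
input_means, input_stds = channel_stats(inputs)
target_means, target_stds = channel_stats(targets)

# Each set of statistics is then broadcast over its own stack, as in the
# broadcasting sketch earlier in the conversation.
```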
Changes Made
Please ensure your PR meets the following requirements: