Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refac: Algorithm specific configurations #344

Merged
merged 17 commits into from
Jan 14, 2025
Merged

Conversation

jdeschamps
Copy link
Member

@jdeschamps jdeschamps commented Jan 9, 2025

Description

Important

TLDR: Split Pydantic configurations to simplify validation.

Following #320, this PR adds algorithm-specific main configurations, as well as algorithm and data sub-configurations when necessary (e.g. N2V) in order to simplify the code base. Additionally, it fix mypy handling of Pydantic models using the Pydantic mypy plugin. Finally, the CustomModel capacity was removed, fixing errors in the mypy report.

Pydantic mypy plugin

Natively, mypy does not have a good handle of Pydantic models. Adding the Pydantic mypy plugin yields new errors that were hidden from us (see report here).

Custom models

The custom models were added without having a real use-case, and the examples were mostly very simple toy examples. It is likely that any real use case will require more than plugging in a model in the CAREamist API, and will probably start with using one's own Lightning module in the Lightning API.

In addition, the presence of unknown Pydantic models creates issues with mypy (see previous point), and adds complexity in the code for a feature we are not sure will be used. This PR removes the CustomModel class and all mechanism to register models.

Splitting configurations

Another pain point in the code base (#320) is that having one single Configuration for vastly different algorithms (and their constraints) generates complex validation code: validation across sub-configurations, many if else statements, and long files with tons of field_validators.

An elegant solution is to define a general configuration and sub-class it with algorithm-specific Pydantic models. Then, a "factory" model can leverage Pydantic model selection to choose the correct algorithm-specific configuration based on the parameters. Here is an illustration:

class Configuration(BaseModel):
    # some example parameters
    version: Literal["0.1.0"] = "0.1.0"
    algorithm_config: Union[UNetBasedAlgorithm, VAEBasedAlgorithm]
    data_config: Union[DataConfig, N2VDataConfig]

class N2VConfiguration(Configuration):
    algorithm_config: UNetBasedAlgorithm
    data_config: N2VDataConfig

class HDN(Configuration):
    algorithm_config: VAEBasedAlgorithm
    data_config: DataConfig

In this scenario, all the validation related to the N2VManipulate is in N2VDataConfig and N2VConfiguration can freely assume that the correct transforms are present. Similarly, DataConfig does not need to care about N2VManipulate, and other subclasses can exclude it from the allowed types in its transforms. In the same spirit, the algorithm config is always UNet for N2V, and VAE for HDN. In this PR, we followed that idea with algorithm specific configurations (both for the main configuration and the algorithm sub configuration).

The main issue raised by this approach is that Configuration can no longer be used to instantiate the correct configuration, but rather configuration_factory should be used:

my_cfg = {
    ...
}

# old way
cfg = Configuration(**my_cfg)

# new approach
cfg = configuration_factory(my_cfg)

Here, the model choice (N2VConfiguration, CAREConfiguration, etc...) is delegated to Pydantic via the configuration_factory.

Similar factories have been created for the algorithm and data sub-configurations in order to allow correct instantiation in the Lightning API.

Changes Made

  • Added:
    • N2VConfiguration: main configuration for N2V
    • N2NConfiguration: idem for N2N
    • CAREConfiguration: idem for CARE
    • N2VDataConfig: data subconfiguration for N2V (enforced N2VManipulate)
    • N2VAlgorithm: algorithm subconfiguration for N2V
    • N2NAlgorithm: idem for N2N
    • CAREAlgorithm: idem for CARE
    • ConfigurationFactory: a Pydantic model instantiating the correct configuration
    • AlgorithmFactory: idem for algorithm subconfigurations
    • DataFactory: idem for data subconfigurations
    • configuration_io.py: read/save methods for configurations, refactored from configuration_factory.py
  • Modified:
    • All calls to Configuration and Configuration itself.
    • Renamed UNet and VAE algorithm configurations.
  • Removed: references module, all calls to has_n2v_manipulate, anything to do with CustomModel.

Warning

Testing could be improve further and there are probably some aspects that I am missing. But I'd rather get reviews already!

Related Issues

Breaking changes

  • Configuration, algorithm subconfig, and data subconfig should be instantiated via the factories.
  • Renaming impacts the algorithm subconfigurations (UNetAlgorithm, VAEAlgorithm).
  • CustomModel is now completely removed.
  • Calls to has_n2v_manipulate no longer work, prefer isinstance(cfg, N2VConfiguration).

Warning

Any current work on adding new algorithm and configuration will be largely impacted by this PR. @CatEek


Please ensure your PR meets the following requirements:

  • Code builds and passes tests locally, including doctests
  • New tests have been added (for bug fixes/features)
  • Pre-commit passes
  • PR to the documentation exists (for bug fixes / features)

@jdeschamps jdeschamps mentioned this pull request Jan 9, 2025
4 tasks
@jdeschamps jdeschamps marked this pull request as ready for review January 10, 2025 12:22
@jdeschamps jdeschamps requested review from CatEek and melisande-c and removed request for CatEek January 10, 2025 12:22
Copy link
Member

@melisande-c melisande-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Will def make maintenance easier in future. Added a few small comments and suggestions. (I didn't look at tests yet)

src/careamics/config/algorithms/unet_algorithm_model.py Outdated Show resolved Hide resolved
src/careamics/config/configuration_factory.py Outdated Show resolved Hide resolved
src/careamics/config/configuration_io.py Outdated Show resolved Hide resolved
src/careamics/config/data/data_model.py Show resolved Hide resolved
src/careamics/lightning/train_data_module.py Outdated Show resolved Hide resolved
src/careamics/config/data/n2v_data_model.py Outdated Show resolved Hide resolved
src/careamics/config/data/n2v_data_model.py Show resolved Hide resolved
Copy link

codecov bot commented Jan 11, 2025

Codecov Report

Attention: Patch coverage is 90.43062% with 40 lines in your changes missing coverage. Please review.

Project coverage is 87.66%. Comparing base (b29fc6c) to head (ddecddc).
Report is 25 commits behind head on main.

Files with missing lines Patch % Lines
src/careamics/config/n2v_configuration.py 81.92% 15 Missing ⚠️
src/careamics/config/configuration.py 88.23% 6 Missing ⚠️
src/careamics/config/data/n2v_data_model.py 88.46% 6 Missing ⚠️
src/careamics/config/care_configuration.py 75.00% 5 Missing ⚠️
...areamics/config/algorithms/care_algorithm_model.py 76.92% 3 Missing ⚠️
...careamics/config/algorithms/n2n_algorithm_model.py 76.92% 3 Missing ⚠️
...areamics/config/algorithms/unet_algorithm_model.py 94.11% 1 Missing ⚠️
src/careamics/config/configuration_io.py 95.23% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #344      +/-   ##
==========================================
- Coverage   87.80%   87.66%   -0.15%     
==========================================
  Files         137      144       +7     
  Lines        4971     5049      +78     
==========================================
+ Hits         4365     4426      +61     
- Misses        606      623      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@melisande-c melisande-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still haven't looked at tests but I'm sure they're good 😄

@jdeschamps jdeschamps merged commit 2bdd021 into main Jan 14, 2025
18 checks passed
@jdeschamps jdeschamps deleted the jd/feat/refac_config branch January 14, 2025 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better pydantic configuration organization
3 participants