-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for mlflow #77
Conversation
the behavior of xr.Dataset.dims will change soon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, I think it all looks good now! This is ready as far as I am concerned
Thanks @SimonKamuk. I just noticed I didn't restore the defaults in the pytorch lightning trainer call in |
A small final touch.
From my perspective this is now ready. |
I've looked over this at a high-level and I think it is looking very good! I don't have the capacity for some in-depth testing right now, but I see from the discussions above that there have been multiple pairs of eyes looking at this, so I don't think it requires my detailed input again 😄 Only thing that I think would be good would be to move the CustomMLFlowLogger to its own file, and possibly _setup_training_logger to utils.py. It is quite nice to keep the list of argparse argument definitions close to the top of train_model.py. And CustomMLFlowLogger is substantial enough that it deserves its own file anyhow. Maybe we could even have a Apart from the point above I think this can be merged. |
Sounds like a plan @joeloskarsson. I will do that right away. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking really great! I tested both logging on wandb
and mlflow
and both work fine. Although I did find I had to include the protocol (i.e. https://
) in the URL otherwise the logging failed silently. You have put the protocol in the README example though :) so that's great! I think there might be a bug in mlflow here...
Anyway, I only made a few minor suggestions. The main one is documenting in the README how the credentials for AWS S3 can be defined using a shared credentials files (rather than setting environment variables for the key and secret)
Apply suggestions from review by leifdenby Co-authored-by: Leif Denby <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ready to go! 🚀
Describe your changes
Add support for mlflow logger by utilising pytorch_lightning.loggers
The native wandb module is replaced with pytorch_lightning wandb logger and introducing pytorch_lightning mlflow logger.
https://github.com/Lightning-AI/pytorch-lightning/blob/master/src/lightning/pytorch/loggers/logger.py
This will allow people to choose between wandb and mlflow.
Builds upon #66 although this is not strictly necessary for this change, but I am working with this feature to work with our dataset.
Issue Link
Closes #76
Type of change
Checklist before requesting a review
pull
with--rebase
option if possible).Checklist for reviewers
Each PR comes with its own improvements and flaws. The reviewer should check the following:
Author checklist after completed review
reflecting type of change (add section where missing):
Checklist for assignee