Properly handle difference between restoring only model weights and full training state #131

Open
joeloskarsson opened this issue Feb 17, 2025 · 0 comments
Labels: enhancement (New feature or request)

Comments

@joeloskarsson (Collaborator)

The --restore_opt argument is supposed to differentiate between restoring only the model weights and restoring the full training state, including the optimizer state (the crucial part), the epoch number, etc. This is currently implemented as a bit of a hack:

if not self.restore_opt:
    # Overwrite the checkpoint's optimizer state with a freshly initialized one
    opt = self.configure_optimizers()
    checkpoint["optimizer_states"] = [opt.state_dict()]
This hack can easily start causing problems as we build on it.

A proper way to handle this in Lightning is to differentiate between instantiating the model with load_from_checkpoint (restoring weights only) and calling Trainer.fit with ckpt_path (restoring the full training state). An example implementation that could be used is given here: joeloskarsson@e7d11c9
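
For reference, a minimal sketch (not the linked implementation) of how the two restore modes could be separated around Trainer.fit. The function name, model_class, data_module, and argument names (args.load, args.restore_opt, args.epochs) are assumptions for illustration:

```python
import pytorch_lightning as pl


def run_training(model_class, args, data_module):
    """Start or resume training, separating weights-only from full-state restore."""
    trainer = pl.Trainer(max_epochs=args.epochs)

    if args.load and args.restore_opt:
        # Full resume: ckpt_path makes Lightning restore model weights,
        # optimizer state, LR schedulers and the epoch/step counters.
        model = model_class(args)
        trainer.fit(model, datamodule=data_module, ckpt_path=args.load)
    else:
        if args.load:
            # Weights only: load_from_checkpoint restores the parameters,
            # while the optimizer and training counters start fresh.
            model = model_class.load_from_checkpoint(args.load, args=args)
        else:
            model = model_class(args)
        trainer.fit(model, datamodule=data_module)
```

With this split the checkpoint dict is never mutated by hand, so the --restore_opt flag no longer needs the optimizer_states workaround above.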

@joeloskarsson added the enhancement label Feb 17, 2025