
[Feature]: Experiment with alternative schedulers for scheduler_g and scheduler_d #945

Open
Bebra777228 opened this issue Jan 10, 2025 · 5 comments

@Bebra777228
Contributor

Description

In the train.py file, there is a section for initializing schedulers. Have you tried replacing the ExponentialLR class with a different scheduler, such as CosineAnnealingLR or ReduceLROnPlateau, or any other scheduler?

I believe it would be beneficial to experiment with different scheduler classes. There might be a better option than ExponentialLR that could improve the training process.
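For reference, a minimal sketch of what the scheduler setup in train.py roughly looks like and how the class could be swapped (the model, optimizer, and hyperparameter values below are illustrative placeholders, not the exact code or settings in the repository):

```python
import torch
from torch import nn

# Placeholder models and hyperparameters; in train.py these come from the
# actual generator/discriminator and the training config.
net_g, net_d = nn.Linear(8, 8), nn.Linear(8, 8)
optim_g = torch.optim.AdamW(net_g.parameters(), lr=2e-4)
optim_d = torch.optim.AdamW(net_d.parameters(), lr=2e-4)
total_epochs = 500  # illustrative

# Current behaviour: exponential decay of the learning rate each epoch.
scheduler_g = torch.optim.lr_scheduler.ExponentialLR(optim_g, gamma=0.999875)
scheduler_d = torch.optim.lr_scheduler.ExponentialLR(optim_d, gamma=0.999875)

# Possible alternative: cosine annealing towards a small floor over the run.
# scheduler_g = torch.optim.lr_scheduler.CosineAnnealingLR(optim_g, T_max=total_epochs, eta_min=1e-6)
# scheduler_d = torch.optim.lr_scheduler.CosineAnnealingLR(optim_d, T_max=total_epochs, eta_min=1e-6)
```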


@Bebra777228 Bebra777228 added enhancement New feature or request feature labels Jan 10, 2025
@blaisewf
Member

“I believe it would be beneficial to experiment with different scheduler classes” — can you share any results from this? We haven't tested it on our end yet.

@Bebra777228
Contributor Author

Unfortunately, I haven't conducted any tests and I don't have any results. However, when I was reviewing the code and saw that you had set up the option to choose different optimizers, I had an idea to try changing the learning rate scheduler as well.

In the lr_scheduler.py file, I found a variety of different schedulers. I started reading the descriptions of each one, and two particularly caught my attention: CosineAnnealingLR and ReduceLROnPlateau.

Of course, I didn't review all the available options; there might be something better. But out of the ones I read, these two stood out to me. Yes, the descriptions might be a bit exaggerated, but why not give them a try? :)


short descriptions (unofficial)

ExponentialLR

Gradually decreases the learning rate following an exponential function with each epoch.

CosineAnnealingLR

Smoothly decreases the learning rate following a cosine curve, which helps stabilize training.

ReduceLROnPlateau

Reduces the learning rate when the model's performance metric stops improving, helping to avoid getting stuck in local minima.
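To make the differences concrete, here is a sketch of how each of the three would be constructed and stepped in PyTorch (all attached to one optimizer purely for illustration; the hyperparameter values are placeholders, not recommendations). Note that ReduceLROnPlateau is the odd one out: its step() call needs the monitored metric.

```python
import torch
from torch import nn

model = nn.Linear(8, 8)                                  # placeholder model
optim = torch.optim.AdamW(model.parameters(), lr=2e-4)

# ExponentialLR: lr <- lr * gamma after every epoch.
exp_sched = torch.optim.lr_scheduler.ExponentialLR(optim, gamma=0.999)

# CosineAnnealingLR: lr follows a cosine curve from the initial lr down to
# eta_min over T_max epochs.
cos_sched = torch.optim.lr_scheduler.CosineAnnealingLR(optim, T_max=500, eta_min=1e-6)

# ReduceLROnPlateau: lr is multiplied by `factor` after `patience` epochs
# without improvement of the monitored metric.
plateau_sched = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optim, mode="min", factor=0.5, patience=10
)

# Per-epoch stepping:
#   exp_sched.step()              # or cos_sched.step()
#   plateau_sched.step(val_loss)  # needs a validation/mel loss value each epoch
```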

@blaisewf
Member

could you try it and share your impressions?

@Bebra777228
Contributor Author

Sure, I'll give it a try 👌

@AznamirWoW
Contributor

AdamW may benefit from a warmup scheduler; RAdam does not need one.
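For example, a linear warmup could be chained in front of the existing exponential decay with SequentialLR. This is only a sketch under assumed names and values (warmup_epochs, the placeholder model, and the learning rates are illustrative), not the project's current code:

```python
import torch
from torch import nn

model = nn.Linear(8, 8)                                  # placeholder model
optim = torch.optim.AdamW(model.parameters(), lr=2e-4)   # AdamW, as discussed

warmup_epochs = 5  # illustrative
# Ramp the lr linearly from 1% of the base value up to the base value...
warmup = torch.optim.lr_scheduler.LinearLR(optim, start_factor=0.01, total_iters=warmup_epochs)
# ...then hand over to the exponential decay the training loop already uses.
decay = torch.optim.lr_scheduler.ExponentialLR(optim, gamma=0.999)
scheduler = torch.optim.lr_scheduler.SequentialLR(
    optim, schedulers=[warmup, decay], milestones=[warmup_epochs]
)

# for epoch in range(num_epochs):
#     train_one_epoch(...)
#     scheduler.step()
```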
