Warmup steps #272

mikehdt · 2025-01-16T10:28:14Z

I'm just learning LoRA training, and I'm trying to use the "constant with warmup" LR scheduler with a Warmup Ratio set. However, I'm not sure the field is doing anything, or whether I've used it correctly.

I'm guessing the value is 0 -> 1 where eg. 0.05 would be 5% of your steps used as warmup.

First, would this not make more sense as a field of 0 -> 100% instead of a fractional value?

Second, how can I validate that the correct number of warmup steps have been set when training starts? I don't see any value for how many warmup steps it should be using in the logs, hence why I'm not sure whether it worked or not.

Thanks!

Jelosus2 · 2025-01-16T18:08:36Z

When you train you should be able to see the total warm up steps in the settings of the optimizer

mikehdt · 2025-01-19T01:18:42Z

To be clear, yes, I can see the warmup steps in the UI, albeit as a fractional number which as noted I'm assuming is meant to be between 0 -> 1. However, when I start training, there's no indication in the logs as to how many steps it's calculated for warmup.

I'm comparing this to using Kohya's GUI, which does output in the log window the calculated number of steps.

To be clear I know that it's just logging info, but without seeing what it's calculated, I'm not sure if what I've chosen is right.

Jelosus2 · 2025-01-19T09:04:41Z

No, like, literally in the scheduler parameters

uwidev · 2025-01-19T19:24:37Z

I made some modifications on my end that allow float values to be used. Unfortunately this isn't as simple as just modifying the scheduler, because the max steps value isn't passed at all on sd-scripts' end. I got around this by adding these lines of code to sd-scripts' library/train_util.py.

https://github.com/uwidev/sd-scripts/blob/491c32c57c406e12c548e2e267c1d0592ef6b6e1/library/train_util.py#L4429-L4438

This will parse the args, such that if it's a float, it will be a percentage of the max steps. You use the same variables.

e.g.
I set training for 100 steps.
first_cycle_max_steps = 1.0 OR 0 -> first_cycle_max_steps = 100 (must be 1.0, not 1)
warmup_steps = 0.1 -> warmup_steps = 10

This specific code probably won't be added to sd-scripts, but a more generic change would be for sd-scripts to at least pass num_training_steps.

uwidev · 2025-01-19T21:10:35Z

Just made a pull request that would allow custom schedulers, like the implemented REX, to utilize pre-calculated variables related to the scheduler. I've adjusted REX on my end to utilize those values, but until the pull request is merged, use what I posted in my previous post.

kohya-ss/sd-scripts#1883

mikehdt · 2025-01-19T22:09:09Z

> No, like, literally in the scheduler parameters

Hmm. I don't see that, at least not with AdamW. Here's how I've set the optimizer, just with 30% as a number to test:

And what I get in the logs:

Interestingly I went back to KSS GUI just to compare (although I've found the generation is inexplicably slower with it, probably a setting that is set in it but not in Lora ETS, I'm not experienced with training and the parameters enough to determine though). This is what's in the log. It does show calculated steps:

But oddly once it hits the optimizer it doesn't set it at all?:

Very confusing. Again, I'm not an expert with this, maybe I've set something incorrectly?

uwidev · 2025-01-19T22:17:39Z

My bad. I read this wrong and assumed it was about the custom schedulers. If you use the normal schedulers, a float (decimal) is treated as a fraction of the max steps, while an integer (whole number) gives exactly that many warmup steps.
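As a rough illustration of the float-versus-integer rule, here is a minimal sketch. The helper name `resolve_warmup_steps` is hypothetical and this is not sd-scripts' actual code; truncating with `int()` is also an assumption about how the fraction gets rounded.

```python
def resolve_warmup_steps(warmup, max_train_steps):
    """Hypothetical helper: turn a warmup setting into a step count.

    A float is read as a fraction of max_train_steps (0.1 means 10%).
    An int is read as an exact number of warmup steps.
    """
    if isinstance(warmup, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("warmup must be an int or a float")
    if isinstance(warmup, float):
        # Assumption: the fractional result is truncated, not rounded.
        return int(warmup * max_train_steps)
    return int(warmup)


if __name__ == "__main__":
    print(resolve_warmup_steps(0.1, 100))  # fraction of 100 steps
    print(resolve_warmup_steps(10, 100))   # exact step count
```

Note that under this rule `1.0` and `1` mean different things: the float is the whole run, the integer is a single step.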

If the warmup was displayed as a percentage, users would be locked to using a percentage rather than a static number of warmup steps. Of course, you could have a way to change the units from percent to whole number, but having a single value makes it subjectively neater on the backend.

As for validation, you will have to log the training. Make sure you set up the logging directory, and then with python, install tensorboard with
python -m pip install tensorboard

And then run tensorboard with
tensorboard --logdir <your logging directory>

It should give you a link to see the training details. There you can verify that warmup steps are correct. You need to actually run the training and wait until it hits the end of the intended warmup, then you'll know it's correct or not.
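To know what to look for in the TensorBoard learning-rate plot, it can help to compute the expected curve beforehand. The sketch below assumes the scheduler ramps linearly from 0 to the base learning rate over the warmup steps and then stays flat, which is how a constant-with-warmup schedule is commonly defined. The function name and the example values (`base_lr`, `warmup_steps`) are made up for illustration.

```python
def constant_with_warmup_factor(step, warmup_steps):
    """Multiplier applied to the base LR at a given optimizer step.

    Assumption: linear ramp from 0 during warmup, then constant at 1.0.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return 1.0


if __name__ == "__main__":
    base_lr = 1e-4      # example value only
    warmup_steps = 10   # example value only
    for step in (0, 5, 10, 50):
        lr = base_lr * constant_with_warmup_factor(step, warmup_steps)
        print(f"step {step}: lr = {lr:.2e}")
```

If the plotted learning rate stops rising and flattens out at about the step you expected, the warmup setting was most likely applied as intended.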

uwidev · 2025-01-20T00:47:25Z

What @Jelosus2 posted is specifically for custom schedulers. They need a specific parameter, warmup_steps, in the optional args and do not use Warmup Ratio, which is used for built-in schedulers. The constant with warmup scheduler, which is what you, @mikehdt, are using, uses Warmup Ratio, and it is not displayed on the console as far as I am aware.

Jelosus2 · 2025-01-20T06:22:10Z

You are right, but regardless, the warmup steps are calculated in the backend and then passed as an argument to the scripts. To know how many warmup steps you are using, you can use this formula: (steps * batch size) * warmup percentage. Keep in mind that gradient accumulation counts toward batch size, so the total batch size is gradient accumulation * batch size, and steps is the value you see in the console while training.
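The formula above can be written out as a small calculation. This is only a sketch of the formula as stated in this comment, not the backend's actual code: the function and parameter names are hypothetical, the warmup percentage is taken as a fraction (0.05 for 5%), and truncating the result with `int()` is an assumption.

```python
def estimate_warmup_steps(console_steps, batch_size, grad_accum, warmup_ratio):
    """Hypothetical helper applying (steps * batch size) * warmup percentage.

    console_steps: the step count shown in the console while training.
    batch_size * grad_accum: the total batch size described above.
    warmup_ratio: the warmup percentage as a fraction, e.g. 0.05 for 5%.
    """
    total_batch_size = batch_size * grad_accum
    # Assumption: the result is truncated to a whole number of steps.
    return int(console_steps * total_batch_size * warmup_ratio)


if __name__ == "__main__":
    # Example values only: 100 console steps, batch size 2,
    # gradient accumulation 2, warmup over a quarter of the run.
    print(estimate_warmup_steps(100, 2, 2, 0.25))
```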

derrian-distro · 2025-01-25T02:28:58Z

it appears that kohya doesn't display the warmup steps if provided an integer value. I have not tested how it displays in the event of providing it a ratio, but it seems like kohya supports that naturally now.

edit: it appears that nothing is actually displayed regardless, I will add a print statement so that the calculated value is visible to the end user, however it will just be standard text, so it might be a bit hard to pick out if it moves too fast

edit edit: I have decided to try and replicate the logger output so it's easier to pick out
