Adding Bells and Whistles to the Training Loop

The main part used a relatively simple training function to keep the code readable and to fit Part 4 within the page limits. Optionally, we can add linear warm-up, a cosine decay schedule, and gradient clipping to improve training stability and convergence.
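To make the three additions concrete, here is a minimal, framework-free sketch. It is not the appendix code: the function names `lr_at_step` and `clip_by_global_norm` and all parameter names and default values are assumptions chosen for illustration. The learning rate ramps linearly from `initial_lr` to `peak_lr` during warm-up, then follows a half-cosine down to `min_lr`. Gradient clipping rescales the gradients whenever their global L2 norm exceeds `max_norm`.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, peak_lr,
               initial_lr=0.0, min_lr=0.0):
    """Return the learning rate for a 0-indexed optimizer step.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    if step < warmup_steps:
        # Linear warm-up: ramp from initial_lr toward peak_lr.
        return initial_lr + (peak_lr - initial_lr) * step / warmup_steps
    # Cosine decay: anneal from peak_lr to min_lr over the remaining steps.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm


if __name__ == "__main__":
    # Print the schedule at a few steps of a 100-step run with 10 warm-up steps.
    for step in (0, 5, 10, 55, 100):
        print(step, lr_at_step(step, total_steps=100, warmup_steps=10, peak_lr=1e-3))
    # A gradient with norm 5 is scaled down to norm 1.
    print(clip_by_global_norm([3.0, 4.0], max_norm=1.0))
```

In a PyTorch training loop, the same idea would typically be applied by writing the computed value into each `param_group["lr"]` of the optimizer before `optimizer.step()`, and by calling `torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=...)` after `loss.backward()`.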

You can find the code for this more sophisticated training function in Appendix B: Adding Bells and Whistles to the Training Loop.
