[WIP][Verl] Add entropy loss to `cross_entropy_loss` and `fused_linear_cross_entropy_loss` #551

hongpeng-guo · 2025-01-30T04:34:59Z

Summary

In RLHF workflows such as verl, the actor forward function usually produces both a cross-entropy loss (-log_probs) and an entropy loss; the latter is used to keep the policy from becoming overly deterministic.

There is a real need for a kernel that produces both losses without materializing the huge logits tensor. Liger-Kernel's fused_linear_cross_entropy_loss already works well for the cross_entropy_loss, but it does not calculate the second part of the loss, i.e., the entropy loss.
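For reference, the two quantities can be written for a single row of logits. The notation below (logits $x \in \mathbb{R}^V$, target index $y$, log-sum-exp $\mathrm{lse}$) is introduced here for illustration and is not taken from the PR:

$$
p_i = \frac{e^{x_i}}{\sum_{k=1}^{V} e^{x_k}}, \qquad \mathrm{lse}(x) = \log \sum_{k=1}^{V} e^{x_k}
$$

$$
\mathcal{L}_{\mathrm{CE}} = -\log p_y = \mathrm{lse}(x) - x_y
$$

$$
H = -\sum_{i=1}^{V} p_i \log p_i = \mathrm{lse}(x) - \sum_{i=1}^{V} p_i x_i
$$

Both expressions depend on the same $\mathrm{lse}(x)$, which is why a single kernel that already computes the softmax statistics for the cross-entropy loss can, in principle, also produce the entropy.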

This PR adds an entropy loss option to the existing FLCE (fused linear cross entropy) loss and serves as an important step toward supporting verl.

Add the entropy calculation to the second pass of the online softmax in cross_entropy.py::liger_cross_entropy_kernel; both the loss and its gradient with respect to the input are calculated and stored.
Propagate the changes to the relevant modules in fused_linear_cross_entropy.py.
Propagate the relevant changes to the other functional modules in the PyTorch interface.
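The first step above can be illustrated with a minimal pure-Python sketch for one row of logits. It is not the Triton kernel from this PR: the function name `ce_and_entropy`, the `block_size` parameter, and the exact loop structure are assumptions made for illustration. It uses the standard softmax identities: the cross-entropy gradient is `p_j - 1[j == target]`, and the entropy gradient is `-p_j * (log p_j + H)`.

```python
import math


def ce_and_entropy(logits, target, block_size=4):
    """Return (ce, entropy, grad_ce, grad_entropy) for one row of logits.

    Hypothetical reference sketch; the real kernel may organize the passes
    differently.
    """
    # Pass 1 (online softmax): scan block by block, keeping a running max m
    # and a running sum d of exp(x - m), rescaling d whenever m grows.
    m, d = float("-inf"), 0.0
    for start in range(0, len(logits), block_size):
        block = logits[start:start + block_size]
        m_new = max(m, max(block))
        d = d * math.exp(m - m_new) + sum(math.exp(x - m_new) for x in block)
        m = m_new
    lse = m + math.log(d)  # log-sum-exp of the whole row

    # Pass 2: with lse known, accumulate the entropy H = -sum_i p_i * log p_i
    # block by block, never needing more than one block at a time.
    entropy = 0.0
    for start in range(0, len(logits), block_size):
        for x in logits[start:start + block_size]:
            log_p = x - lse
            entropy -= math.exp(log_p) * log_p

    # Cross-entropy loss for the target class: -log p_target.
    ce = lse - logits[target]

    # Gradients with respect to the logits.
    grad_ce = [
        math.exp(x - lse) - (1.0 if j == target else 0.0)
        for j, x in enumerate(logits)
    ]
    grad_entropy = [
        -math.exp(x - lse) * ((x - lse) + entropy) for x in logits
    ]
    return ce, entropy, grad_ce, grad_entropy
```

For uniform logits the entropy equals the log of the vocabulary size and its gradient vanishes, which makes a convenient sanity check. Note that the entropy gradient needs the final value of `H`, so this sketch computes it after the accumulation loop rather than inside it.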

Testing Done

Existing unit tests now pass; new unit tests are still WIP.

Hardware Type:
run make test to ensure correctness
run make checkstyle to ensure code style
run make test-convergence to ensure convergence

Signed-off-by: Hongpeng Guo <[email protected]>

Tcc0403 · 2025-01-30T11:15:43Z

Please add a unit test with return_entropy_loss. You can write a new PyTorch implementation like CrossEntropyWithZLoss, or add return_entropy_loss functionality on top of it.

hongpeng-guo added 6 commits January 30, 2025 02:52

run make checkstyle

05f0edb

Signed-off-by: Hongpeng Guo <[email protected]>

wip initial try test existing unitest

6a26dbb

Signed-off-by: Hongpeng Guo <[email protected]>

ruff style check

7dad560

Signed-off-by: Hongpeng Guo <[email protected]>

fix for cross_entropy

1b13b2f

Signed-off-by: Hongpeng Guo <[email protected]>

fix checkstyle

8a43d1e

Signed-off-by: Hongpeng Guo <[email protected]>

wip fix flce

82d9b55

Signed-off-by: Hongpeng Guo <[email protected]>

hongpeng-guo marked this pull request as draft January 30, 2025 04:38

hongpeng-guo changed the title ~~[Feature] Add entropy loss to cross_entropy_loss and fused_linear_cross_entropy_loss~~ [WIP][Feature][Verl] Add entropy loss to cross_entropy_loss and fused_linear_cross_entropy_loss Jan 30, 2025

hongpeng-guo changed the title ~~[WIP][Feature][Verl] Add entropy loss to cross_entropy_loss and fused_linear_cross_entropy_loss~~ [WIP][Verl] Add entropy loss to cross_entropy_loss and fused_linear_cross_entropy_loss Jan 30, 2025

hongpeng-guo mentioned this pull request Jan 30, 2025

[Liger-kernel] Add an option to use _apply_liger_kernel_to_instance() to load model volcengine/verl#133

Merged

hongpeng-guo added 4 commits January 30, 2025 08:04

fix bugs

984e85f

Signed-off-by: Hongpeng Guo <[email protected]>

fix bugs

eb90401

Signed-off-by: Hongpeng Guo <[email protected]>

fix

7684eed

Signed-off-by: Hongpeng Guo <[email protected]>

fix a unit test

bed2d45

Signed-off-by: Hongpeng Guo <[email protected]>

hongpeng-guo requested a review from ByronHsu January 30, 2025 09:56
