[Verl] Add entropy loss to cross_entropy_loss and fused_linear_cross_entropy_loss #551
base: main
Conversation
Please add a unit test with …
Update: ran into a numerical instability issue, investigating.
Thanks for the effort! Let's test thoroughly on both accuracy (numerical stability) and speed (including the existing losses) before checking in. Considering we may fuse more and more losses, such as the existing z-loss and the newly added entropy loss, the API outputs have started to diverge, and the loss function is getting quite heavy, with multiple branches coupled together (label smoothing, target weights, etc.). We probably need to refactor a bit to make it cleaner to develop on later. cc @ByronHsu @shivam15s @Tcc0403 @shimizust
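One possible shape for that refactor, purely as a sketch (the class and field names here are hypothetical, not part of the current API): bundle the growing set of outputs in a single container so that adding a loss such as entropy_loss does not keep widening the positional return signature.

```python
from dataclasses import dataclass
from typing import Optional

import torch


@dataclass
class LossOutputs:
    # Hypothetical container for the losses the fused path can emit.
    # New optional losses can be added without changing existing call sites.
    ce_loss: torch.Tensor
    z_loss: Optional[torch.Tensor] = None
    entropy_loss: Optional[torch.Tensor] = None
```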
@@ -140,6 +149,26 @@ def liger_cross_entropy_kernel(
    # = max(X) + log (sum(e ^ (X_i - max(X)))) = m + log d
    lse = m + tl.log(d)

    # 3.5 Calculate the entropy loss
Can probably put an equation in the PR description, and also a simple one in a comment here, to demonstrate how the entropy_loss is calculated (especially the reuse of m and d computed in the first-pass online softmax).
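For reference, one way to state the requested equation, written in terms of the running max m and running denominator d that the first online-softmax pass already produces (so lse = m + log d); this is the standard identity rather than a description of the exact kernel code:

```math
p_i = \frac{e^{X_i - m}}{d}, \qquad \log p_i = X_i - \mathrm{lse}, \qquad \mathrm{lse} = m + \log d
```

```math
H(p) = -\sum_i p_i \log p_i = -\sum_i p_i \,(X_i - \mathrm{lse}) = \mathrm{lse} - \sum_i p_i X_i
```

So the entropy only needs one extra accumulator for the sum of p_i X_i; m, d, and lse are reused from the cross-entropy computation.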
@@ -248,11 +299,16 @@ def liger_cross_entropy_kernel(
    loss = loss / n_non_ignore
    # TODO: Implement weighted z_loss. Currently, z_loss is not scaled by weight.
    z_loss = z_loss / n_non_ignore
    # TODO: Implement weighted entropy loss. Currently, entropy loss is not scaled by weight.
    entropy_loss = entropy_loss / n_non_ignore
It seems you had already implemented the weight-provided case above with dX_entropy_block = dX_entropy_block / sum_non_ignore_weight. Did I misunderstand anything? If this is not the right equation for the weighted case, please use dX_entropy_block = dX_entropy_block / n_non_ignore above instead, and add a TODO comment for it.
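To make the two reduction conventions under discussion concrete, here is a minimal PyTorch sketch (the function and variable names are illustrative, not the kernel's): with a class-weight tensor, the per-token losses are divided by the sum of the non-ignored weights, matching torch.nn.CrossEntropyLoss(weight=..., reduction="mean"); without weights they are divided by the plain count of non-ignored tokens.

```python
import torch


def reduce_per_token_loss(per_token_loss, targets, ignore_index=-100, weight=None):
    # Sketch of the two mean-reduction conventions discussed above.
    mask = targets != ignore_index
    if weight is None:
        # Unweighted: divide by the number of non-ignored tokens (n_non_ignore).
        return per_token_loss[mask].sum() / mask.sum()
    # Weighted: divide by the sum of weights of the non-ignored targets
    # (sum_non_ignore_weight), as torch's weighted CE with mean reduction does.
    w = weight[targets[mask]]
    return (w * per_token_loss[mask]).sum() / w.sum()
```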
Thanks for catching this. I think this is a bug in my program. I just fixed it. But it seems the numerical problem is still there. Maybe we need to take a deeper look.
BTW, it seems CI has stopped running for this PR for some reason.
Summary
In RLHF workflows such as verl, the actor forward function usually produces two losses: the cross_entropy_loss (-log_probs) and the entropy_loss, the latter used to encourage the policy not to become over-deterministic. There is a real need for a kernel that produces both losses without materializing the huge logits tensor. Liger-Kernel's fused_linear_cross_entropy_loss already works well for the cross_entropy_loss, but it does not compute the second part of the loss, i.e., the entropy loss. This PR adds the entropy loss option to the existing FLCE loss, as one important step toward supporting verl.
- In cross_entropy.py::liger_cross_entropy_kernel, both the loss and its gradient with respect to the input are calculated and stored;
- In fused_linear_cross_entropy.py, …
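For context, a naive unfused PyTorch reference of the two losses described above; it materializes the full logits tensor, which is exactly what the fused kernel avoids, and the function name is only illustrative:

```python
import torch
import torch.nn.functional as F


def reference_ce_and_entropy(hidden, lm_head_weight, targets, ignore_index=-100):
    # Unfused reference: materializes the (N, V) logits tensor.
    logits = hidden @ lm_head_weight.T
    log_probs = F.log_softmax(logits.float(), dim=-1)

    # Cross-entropy part: mean of -log p(target) over non-ignored tokens.
    ce_loss = F.nll_loss(log_probs, targets, ignore_index=ignore_index)

    # Entropy part: H = -sum_i p_i * log p_i per token, averaged the same way.
    mask = targets != ignore_index
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)
    entropy_loss = entropy[mask].sum() / mask.sum()
    return ce_loss, entropy_loss
```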
Testing Done
Made the existing unit tests pass; adding a new unit test is WIP.
- run make test to ensure correctness
- run make checkstyle to ensure code style
- run make test-convergence to ensure convergence
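One possible shape for the WIP unit test, as a sketch that does not touch the Liger API: it only checks that the lse-based entropy formulation (the one that reuses m and d) matches the direct definition, which is the identity the kernel relies on.

```python
import torch


def test_entropy_lse_formulation_matches_direct():
    torch.manual_seed(0)
    logits = 5 * torch.randn(8, 32000, dtype=torch.float32)

    # Direct definition: H = -sum_i p_i * log p_i.
    log_p = torch.log_softmax(logits, dim=-1)
    direct = -(log_p.exp() * log_p).sum(dim=-1)

    # lse-based form: H = lse - sum_i p_i * X_i, with m and d as in the kernel's
    # online softmax (m = running max, d = sum of exp(X_i - m)).
    m = logits.max(dim=-1, keepdim=True).values
    d = (logits - m).exp().sum(dim=-1)
    lse = m.squeeze(-1) + d.log()
    p = (logits - m).exp() / d.unsqueeze(-1)
    lse_based = lse - (p * logits).sum(dim=-1)

    torch.testing.assert_close(direct, lse_based, rtol=1e-5, atol=1e-5)
```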