CrossEntropyLoss returns a single value when reduction is "none" #421

Closed
leng-yue opened this issue Dec 3, 2024 · 5 comments · Fixed by #435

Comments

leng-yue commented Dec 3, 2024

🐛 Describe the bug

When using LigerCrossEntropyLoss with reduction="none", it returns a single scalar value instead of a tensor of per-element losses.

Reproduce

import torch
from liger_kernel.transformers.cross_entropy import LigerCrossEntropyLoss


def test_cross_entropy_matches_torch():
    # Test cases with various shapes and values
    batch_size = 3
    num_classes = 5
    seq_length = 4

    # Generate random logits and labels using torch
    torch.manual_seed(42)
    logits = torch.randn(batch_size, seq_length, num_classes, device='cuda')
    labels = torch.randint(0, num_classes, (batch_size, seq_length), device='cuda')

    # Set some labels to -100 to ignore them
    labels[0, -1] = -100  # Ignore last token of first sequence
    labels[1, -2:] = -100  # Ignore last two tokens of second sequence

    # Liger cross entropy
    liger_ce = LigerCrossEntropyLoss(reduction="none")
    liger_loss = liger_ce(logits.view(-1, num_classes), labels.view(-1))
    print(liger_loss)

    # PyTorch cross entropy
    torch_ce = torch.nn.CrossEntropyLoss(reduction="none")
    torch_loss = torch_ce(logits.view(-1, num_classes), labels.view(-1))

    # Check if losses match
    torch.testing.assert_close(
        liger_loss,
        torch_loss,
        rtol=1e-5,
        atol=1e-5,
        msg="Liger and PyTorch losses don't match",
    )

    print("All tests passed! LigerCrossEntropyLoss matches PyTorch's implementation")


if __name__ == "__main__":
    test_cross_entropy_matches_torch()
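
For reference, PyTorch's reduction="none" returns one loss per flattened token (with zeros at ignore_index positions), so the expected output here is a tensor of shape (batch_size * seq_length,) = (12,). A minimal check of that expectation, reusing the same tensors as above:

# Expected reference behavior: a per-token loss tensor, not a scalar.
torch_loss = torch.nn.functional.cross_entropy(
    logits.view(-1, num_classes), labels.view(-1), reduction="none"
)
assert torch_loss.shape == (batch_size * seq_length,)  # torch.Size([12])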

Versions

Environment Report:

Operating System: Linux-5.15.0-88-generic-x86_64-with-glibc2.35
Python version: 3.10.15
PyTorch version: 2.5.1+cu124
CUDA version: 12.4
Triton version: 3.1.0
Transformers version: 4.46.2

ByronHsu (Collaborator) commented Dec 3, 2024

Sorry. We only support "mean" and "sum" for now. Are you willing to contribute by adding an assertion or implementing "none" reduction? Thanks!

leng-yue (Author) commented Dec 4, 2024

I think I can make a PR that adds an assertion to block "none" first; then we can figure out how to implement the "none" reduction.
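
A minimal sketch of such a guard, assuming LigerCrossEntropyLoss's constructor takes reduction as a keyword argument (the actual PR may differ):

import torch

class LigerCrossEntropyLoss(torch.nn.Module):
    def __init__(self, ignore_index=-100, reduction="mean"):
        super().__init__()
        # Fail fast on unsupported reductions instead of silently summing.
        assert reduction in ("mean", "sum"), (
            f"reduction '{reduction}' is not supported yet; use 'mean' or 'sum'"
        )
        self.ignore_index = ignore_index
        self.reduction = reduction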

Tcc0403 (Collaborator) commented Dec 5, 2024

The Triton kernel behind Liger CE already computes the unreduced per-token loss; we just need a reduction == "none" condition that returns loss_1d directly instead of applying torch.sum():

loss = torch.sum(loss_1d)
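
A minimal sketch of that change, assuming loss_1d (the per-token loss tensor produced by the Triton kernel) and reduction are in scope at this point in the forward pass:

# Return the per-token losses for reduction="none";
# keep the existing sum path for the other reductions.
if reduction == "none":
    loss = loss_1d
else:
    loss = torch.sum(loss_1d)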

hebiao064 (Collaborator) commented

#take I made a PR for this issue; please take a look. Thanks @ByronHsu @leng-yue @Tcc0403

leng-yue (Author) commented Dec 9, 2024

Looks good to me
