
Potential missed inversion of discounts within CUDA implementation of LambdarankNDCG #6847

Open
DJ-Hayden opened this issue Feb 28, 2025 · 0 comments


Description

Hello, I believe there may be a missed discount inversion within the CUDA implementation of LightGBM's LambdarankNDCG ranking objective class. I'm unfortunately better at math than at optimized C++/CUDA, so I may be missing something simple. I'll try to link to the relevant code/lines to justify my reasoning.

  1. Within the normal, non-CUDA implementation of GetGradientsForOneQuery in rank_objective.hpp here, there are two calls to DCGCalculator::GetDiscount to get the discounts.

  2. I believe those calls go to DCGCalculator::GetDiscount here, which looks up pre-computed discounts.

  3. Those discounts appear to be pre-computed here during the DCGCalculator::Init call. Notably, each discount is 1.0 / std::log2(2.0 + i), which matches the math (i.e., the discounts are inverted).

  4. Within the CUDA implementation of GetGradientsKernel_LambdarankNDCG in cuda_rank_objective.cu here, the discounts are not pre-computed.

  5. Instead, each discount here and here is computed directly as log2(2.0f + i).

The CUDA code does not appear to invert the discount anywhere later on. In fact, the non-CUDA and CUDA implementations are nearly identical within the lambda gradient calculation loop (aside from the inversion discrepancy above and the CUDA version's pre-computed sigmoid lookup table).

Reproducible example

The code runs, trains, and seemingly works fine even without the inversion, so in all likelihood I might just be misunderstanding something.

Environment info

LightGBM version: 4.6.0

Additional Comments

Since the ranks are always non-negative, I believe the bug can be fixed with simple inversions within the kernel:

  • const double high_discount = log2(2.0f + high_rank); -> const double high_discount = 1.0f / log2(2.0f + high_rank);
  • const double low_discount = log2(2.0f + low_rank); -> const double low_discount = 1.0f / log2(2.0f + low_rank);