
[R-package] Printing a booster trained with lgb.train() errors unexpectedly #6848

Open
walkerjameschris opened this issue Feb 28, 2025 · 2 comments

@walkerjameschris
Copy link

Description

It is possible to train a model with lgb.train() without specifying an objective in the list of params. However, printing the resulting booster fails. This is because print.lgb.Booster() checks obj == "none", which errors when obj is NULL.

library(lightgbm)

ds <- lgb.Dataset(
  data = as.matrix(mtcars[, -1])
  , label = mtcars[, 1]
)

lgb.train(
  # Empty params
  params = list()
  , data = ds
  , verbose = -1
)
#> Error in `if (obj == "none") ...`:
#> ! argument is of length zero

The simplest fix is to handle the possibility of a NULL objective in print.lgb.Booster(), but it could also be addressed upstream in lgb.train().

if (!handle_is_null) {
  obj <- x$params$objective
  if (obj == "none") {
    obj <- "custom"
  }
}
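For illustration, here is a self-contained sketch of a NULL-tolerant version of that lookup. The helper name format_objective() and the "<not specified>" fallback label are assumptions of mine, not the package's actual code:

```r
# Sketch: a NULL-tolerant version of the objective lookup in
# print.lgb.Booster(). Helper name and fallback label are illustrative.
format_objective <- function(obj) {
  if (is.null(obj)) {
    return("<not specified>")  # objective never set in params
  }
  if (obj == "none") {
    return("custom")           # a custom objective function was supplied
  }
  obj
}

format_objective(NULL)         # no length-zero error, unlike obj == "none"
format_objective("none")
format_objective("regression")
```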

Happy to make a PR if you want to go in a specific direction.

@jameslamb
Collaborator

Thanks for the report! We'd welcome a PR to fix this.

it could also be addressed in lgb.train() upstream

I'm not sure exactly what you're envisioning, but I would not want to see something like this added in lgb.train:

if (is.null(params$objective)) {
    params$objective <- "regression"
}

I'd be opposed to that because it duplicates logic that is already in the core LightGBM C/C++ library. That's already leaked a bit into the R package, e.g. here:

if (objective == "auto") {
  objective <- "regression"
}

The params object in LightGBM's interface isn't really like "the full state of all configuration"... it should be thought of as "overrides of LightGBM's default configuration".

If you're familiar with REST APIs, the params object is like the body you'd attach to a PATCH request, not the one you'd attach to a POST / PUT request.

We've tried to explain that here: https://lightgbm.readthedocs.io/en/latest/Parameters.html#parameters-format
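For example (a sketch reusing the dataset ds from the original report; num_leaves is just an illustrative override):

```r
# Only num_leaves is overridden here; objective, learning_rate, etc.
# fall back to LightGBM's internal C/C++ defaults, not R-side defaults.
model <- lgb.train(
  params = list(num_leaves = 15L)
  , data = ds
  , verbose = -1L
)
```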

So, given all that: if I've understood correctly what you mean by "addressed in lgb.train()", then I think a PR just updating print.lgb.Booster() (and covering print(), show(), and summary() with tests) would be better. You can see the existing tests on those methods for reference:

test_that("Booster's print, show, and summary work correctly", {

I'd support pulling the internal functions up out of that one test case and re-using them across multiple test cases.
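A sketch of what such a regression test might look like (following testthat conventions; the exact structure and expectations are assumptions, not the package's actual test code):

```r
# Sketch: regression test for printing a booster trained without
# an explicit objective in params (illustrative only).
test_that("print() works on a Booster trained with empty params", {
  ds <- lgb.Dataset(
    data = as.matrix(mtcars[, -1])
    , label = mtcars[, 1]
  )
  bst <- lgb.train(params = list(), data = ds, verbose = -1L)
  expect_no_error(print(bst))
  expect_no_error(summary(bst))
})
```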

@walkerjameschris
Author

I’ll take a look and get started soon!
