[Suggestion] Serialization property for learners #891

Closed
sebffischer opened this issue Jan 24, 2023 · 3 comments
@sebffischer
Member

sebffischer commented Jan 24, 2023

Some learners (especially torch learners, but also e.g. lightgbm) break when they are saved and reloaded.
One situation where this occurs is when one calls benchmark(..., store_models = TRUE).

To avoid this / make it more comfortable for the user, I suggest adding a property "serialize".
If this property is present, a learner must implement a public method serialize() that converts the learner's $state into a serialized state.

To implement that, we could store the learner's $state in a private field $.state and make $state an active binding that unserializes the learner's state if it is accessed while serialized.

This allows us to hide the serialization from the user in some circumstances, e.g. in the benchmark() function we can call learner$serialize() if store_models is TRUE. The user can then access the learner afterwards and does not have to call learner$unserialize(), because this happens automatically when they access the state.

E.g. using bundle, this might look as follows for LightGBM:

LearnerClassifLightGBM = R6Class("LearnerClassifLightGBM",
  ...,
  public = list(
    serialize = function() {
      # Replace the in-memory state with its bundled (serializable) form.
      private$.state = bundle(private$.state)
      private$.serialized = TRUE
    }
  ),
  private = list(
    .state = NULL,
    .serialized = FALSE
  ),
  active = list(
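    state = function(x) {
      if (missing(x)) {
        # Read access: lazily restore the state if it is currently bundled.
        if (private$.serialized) {
          private$.state = unbundle(private$.state)
          private$.serialized = FALSE
        }
      } else {
        # Write access: a newly assigned state is not bundled.
        private$.state = x
        private$.serialized = FALSE
      }
      return(private$.state)
    }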
  )
)
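
To try the pattern above without LightGBM or bundle installed, here is a minimal runnable sketch. It is only an illustration under assumptions: `ToyLearner` and `is_serialized()` are hypothetical names, and base R's `serialize()`/`unserialize()` stand in for `bundle()`/`unbundle()`. It requires the R6 package.

```r
library(R6)

# Hypothetical toy class; base serialize()/unserialize() stand in for
# bundle()/unbundle() so the example runs without extra packages.
ToyLearner = R6Class("ToyLearner",
  public = list(
    serialize = function() {
      # Only convert once; a second call is a no-op.
      if (!private$.serialized) {
        private$.state = base::serialize(private$.state, connection = NULL)
        private$.serialized = TRUE
      }
      invisible(self)
    },
    # Helper for inspection only (not part of the proposal).
    is_serialized = function() private$.serialized
  ),
  private = list(
    .state = NULL,
    .serialized = FALSE
  ),
  active = list(
    state = function(x) {
      if (missing(x)) {
        # Lazily restore the state on first read access.
        if (private$.serialized) {
          private$.state = base::unserialize(private$.state)
          private$.serialized = FALSE
        }
      } else {
        private$.state = x
        private$.serialized = FALSE
      }
      private$.state
    }
  )
)

learner = ToyLearner$new()
learner$state = list(model = c(a = 1, b = 2))
learner$serialize()

# Round trip through disk while the state is in its serialized form.
path = tempfile(fileext = ".rds")
saveRDS(learner, path)
restored = readRDS(path)

was_serialized = restored$is_serialized()
restored_state = restored$state  # triggers the lazy unserialize
```

The user never calls an explicit unserialize step here: reading `restored$state` is enough to get the original list back.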

In addition, it might be convenient to offer a $save(path) method that calls $serialize() and then saveRDS().
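
A rough sketch of that convenience, written as a free function rather than a method so it is self-contained. `save_learner()` and the `stub` object are hypothetical names used only for illustration, and base `serialize()` again stands in for `bundle()`:

```r
# Hypothetical helper: works for any object exposing a $serialize() method.
save_learner = function(learner, path) {
  learner$serialize()      # convert the state to its serialized form first
  saveRDS(learner, path)   # then write the whole object to disk
  invisible(path)
}

# Minimal stand-in for a learner, only to exercise the helper.
stub = new.env()
stub$state = list(weights = 1:3)
stub$serialized = FALSE
stub$serialize = function() {
  stub$state = base::serialize(stub$state, connection = NULL)
  stub$serialized = TRUE
  invisible(stub)
}

path = save_learner(stub, tempfile(fileext = ".rds"))
reloaded = readRDS(path)
```

The point of the design is ordering: the state is converted before `saveRDS()` runs, so the file never contains a state that cannot survive a reload.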

@sebffischer
Member Author

Note that there is also an open issue in bundle: rstudio/bundle#13

@sebffischer
Member Author

There is also this related package: https://github.com/HenrikBengtsson/marshal

@sebffischer
Member Author

This is implemented.
