0.8.0 - 2024-09-23
Breaking
- Disallow positional parameters for the learners, except for label and task.
- Remove the unsupported / invalid hyperparameters from the Isolation Forest
  learner.
- Remove parameters for distributed training and resuming training from
  learners that do not support these capabilities.
- By default, model.analyze runs for a maximum of 20 seconds (i.e.
  maximum_duration=20 by default).
- Convert boolean values in categorical sets to lowercase, matching the
  treatment of categorical features.
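
As a rough illustration of the first breaking change, the sketch below shows the Python keyword-only mechanism that yields this kind of behavior: label and task may be passed positionally, while every other parameter must be named. ToyLearner is a hypothetical stand-in, not YDF's implementation, and the num_trees and max_depth parameters with their defaults are assumptions chosen only for the example.

```python
class ToyLearner:
    """Hypothetical stand-in for a learner; not part of YDF."""

    def __init__(self, label, task="CLASSIFICATION", *, num_trees=300, max_depth=6):
        # Parameters declared after the bare `*` are keyword-only.
        self.label = label
        self.task = task
        self.num_trees = num_trees
        self.max_depth = max_depth


def accepts_positional_hyperparameter():
    """Returns True if a hyperparameter can still be passed positionally."""
    try:
        ToyLearner("income", "REGRESSION", 100)  # third positional argument
    except TypeError:
        return False
    return True


# label and task stay positional; hyperparameters are passed by name.
learner = ToyLearner("income", "REGRESSION", num_trees=100)
print(learner.label, learner.task, learner.num_trees)
print(accepts_positional_hyperparameter())
```

Code written against earlier releases that passed hyperparameters by position would need to switch to the named form shown in the last lines.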
Feature
- Warn if training on a VerticalDataset and fail if attempting to modify the
  columns in a VerticalDataset during training.
- User can override the model's task, label or group during evaluation.
- Add num_examples_per_tree() method to Isolation Forest models.
- Expose the slow engine for debugging predictions and evaluations with
  use_slow_engine=True.
- Speed up training of GBT models by ~10%.
- Support for categorical and boolean features in Isolation Forests.
- Add ydf.util.read_tf_record and ydf.util.write_tf_record to facilitate
  the use of TF Record datasets.
- Rename LAMBDA_MART_NDCG5 to LAMBDA_MART_NDCG. The old name is deprecated but
  can still be used.
- Allow configuring the truncation of NDCG losses.
- Enable multi-threading when using model.predict and model.evaluate.
- Default number of threads of model.analyze is equal to the number of cores.
- Add multi-threaded results in model.benchmark.
- Add argument to control the maximum duration of model.analyze.
- Add support for Unicode strings, normalize categorical set values in the
  same way as categorical values, and validate their types.
- Add support for distributed training for ranking gradient boosted tree
  models.
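
To illustrate what the truncation of an NDCG loss refers to, here is a minimal pure-Python sketch of NDCG@k. It assumes the common exponential gain (2^rel - 1) with a log2 position discount; the function names are illustrative and not part of YDF, and YDF's internal computation may differ in its details.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items of a ranked list."""
    return sum(
        (2**rel - 1) / math.log2(rank + 2)  # rank is 0-based
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Relevance labels in the order the model ranked the documents.
ranked = [0, 0, 1, 0, 0, 3]

# With a truncation of 5, the highly relevant document at position 6 does not
# contribute to the numerator; a larger truncation takes it into account.
print(ndcg_at_k(ranked, 5))
print(ndcg_at_k(ranked, 10))
```

A small truncation focuses the metric on the top of the ranking, while a larger one also credits relevant items placed further down.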
Fix
- Fix labels of regression evaluation plots.
- Improve error messages when Isolation Forest training fails.
Release music
Perpetuum Mobile "Ein musikalischer Scherz", Op. 257. Johann Strauss (Sohn)