Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply stricter .eval_time checking for dynamic survival metrics #468

Merged
merged 6 commits into from
Dec 14, 2023

Conversation

EmilHvitfeldt
Copy link
Member

to close #462

library(yardstick)

brier_survival(lung_surv, truth = surv_obj, .pred)
#> # A tibble: 5 × 4
#>   .metric        .estimator .eval_time .estimate
#>   <chr>          <chr>           <dbl>     <dbl>
#> 1 brier_survival standard          100     0.109
#> 2 brier_survival standard          200     0.194
#> 3 brier_survival standard          300     0.219
#> 4 brier_survival standard          400     0.222
#> 5 brier_survival standard          500     0.197

roc_auc_survival(lung_surv, truth = surv_obj, .pred)
#> # A tibble: 5 × 4
#>   .metric          .estimator .eval_time .estimate
#>   <chr>            <chr>           <dbl>     <dbl>
#> 1 roc_auc_survival standard          100     0.659
#> 2 roc_auc_survival standard          200     0.679
#> 3 roc_auc_survival standard          300     0.688
#> 4 roc_auc_survival standard          400     0.648
#> 5 roc_auc_survival standard          500     0.662

roc_curve_survival(lung_surv, truth = surv_obj, .pred)
#> # A tibble: 650 × 4
#>    .threshold sensitivity specificity .eval_time
#>         <dbl>       <dbl>       <dbl>      <dbl>
#>  1   -Inf          0            1            100
#>  2      0.615      0            0.995        100
#>  3      0.723      0.0334       0.985        100
#>  4      0.727      0.133        0.985        100
#>  5      0.730      0.167        0.985        100
#>  6      0.732      0.233        0.980        100
#>  7      0.739      0.233        0.969        100
#>  8      0.741      0.267        0.964        100
#>  9      0.746      0.267        0.959        100
#> 10      0.748      0.267        0.954        100
#> # ℹ 640 more rows

brier_survival_integrated(lung_surv, truth = surv_obj, .pred)
#> # A tibble: 1 × 3
#>   .metric                   .estimator .estimate
#>   <chr>                     <chr>          <dbl>
#> 1 brier_survival_integrated standard       0.158

lung_surv_neg <- lung_surv
lung_surv_neg$.pred[[1]]$.eval_time[1] <- -100

brier_survival(lung_surv_neg, truth = surv_obj, .pred)
#> Error in `brier_survival()`:
#> ✖ Negative values of .eval_time are not allowed.
#> ℹ The following negative values were found: -100.

roc_auc_survival(lung_surv_neg, truth = surv_obj, .pred)
#> Error in `roc_auc_survival()`:
#> ✖ Negative values of .eval_time are not allowed.
#> ℹ The following negative values were found: -100.

roc_curve_survival(lung_surv_neg, truth = surv_obj, .pred)
#> Error in `roc_curve_survival()`:
#> ✖ Negative values of .eval_time are not allowed.
#> ℹ The following negative values were found: -100.

brier_survival_integrated(lung_surv_neg, truth = surv_obj, .pred)
#> Error in `brier_survival_integrated()`:
#> ✖ Negative values of .eval_time are not allowed.
#> ℹ The following negative values were found: -100.

lung_surv_na <- lung_surv
lung_surv_na$.pred[[1]]$.eval_time[1] <- NA

brier_survival(lung_surv_na, truth = surv_obj, .pred)
#> Error in `brier_survival()`:
#> ✖ Missing values in .eval_time are not allowed.

roc_auc_survival(lung_surv_na, truth = surv_obj, .pred)
#> Error in `roc_auc_survival()`:
#> ✖ Missing values in .eval_time are not allowed.

roc_curve_survival(lung_surv_na, truth = surv_obj, .pred)
#> Error in `roc_curve_survival()`:
#> ✖ Missing values in .eval_time are not allowed.

brier_survival_integrated(lung_surv_na, truth = surv_obj, .pred)
#> Error in `brier_survival_integrated()`:
#> ✖ Missing values in .eval_time are not allowed.

lung_surv_inf <- lung_surv
lung_surv_inf$.pred[[1]]$.eval_time[1] <- Inf

brier_survival(lung_surv_inf, truth = surv_obj, .pred)
#> Error in `brier_survival()`:
#> ✖ Infinite values of .eval_time are not allowed.

roc_auc_survival(lung_surv_inf, truth = surv_obj, .pred)
#> Error in `roc_auc_survival()`:
#> ✖ Infinite values of .eval_time are not allowed.

roc_curve_survival(lung_surv_inf, truth = surv_obj, .pred)
#> Error in `roc_curve_survival()`:
#> ✖ Infinite values of .eval_time are not allowed.

brier_survival_integrated(lung_surv_inf, truth = surv_obj, .pred)
#> Error in `brier_survival_integrated()`:
#> ✖ Infinite values of .eval_time are not allowed.

lung_surv_duplicate <- lung_surv
lung_surv_duplicate$.pred[[1]]$.eval_time[1] <- 200

brier_survival(lung_surv_duplicate, truth = surv_obj, .pred)
#> Error in `brier_survival()`:
#> ✖ Duplicate values of .eval_time are not allowed.

roc_auc_survival(lung_surv_duplicate, truth = surv_obj, .pred)
#> Error in `roc_auc_survival()`:
#> ✖ Duplicate values of .eval_time are not allowed.

roc_curve_survival(lung_surv_duplicate, truth = surv_obj, .pred)
#> Error in `roc_curve_survival()`:
#> ✖ Duplicate values of .eval_time are not allowed.

brier_survival_integrated(lung_surv_duplicate, truth = surv_obj, .pred)
#> Error in `brier_survival_integrated()`:
#> ✖ Duplicate values of .eval_time are not allowed.

lung_surv_order <- lung_surv
lung_surv_order$.pred <- purrr::map(lung_surv_order$.pred, dplyr::arrange, desc(.eval_time))

brier_survival(lung_surv_order, truth = surv_obj, .pred)
#> Error in `brier_survival()`:
#> ✖ Values of .eval_time must be in increasing order.

roc_auc_survival(lung_surv_order, truth = surv_obj, .pred)
#> Error in `roc_auc_survival()`:
#> ✖ Values of .eval_time must be in increasing order.

roc_curve_survival(lung_surv_order, truth = surv_obj, .pred)
#> Error in `roc_curve_survival()`:
#> ✖ Values of .eval_time must be in increasing order.

brier_survival_integrated(lung_surv_order, truth = surv_obj, .pred)
#> Error in `brier_survival_integrated()`:
#> ✖ Values of .eval_time must be in increasing order.

Created on 2023-12-13 with reprex v2.0.2

@EmilHvitfeldt
Copy link
Member Author

There is no NEWS bullet since these functions are not official yet

Copy link
Member

@hfrick hfrick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! One important question about the ordering of eval_time though, see the comments 🙌

cli::cli_abort(
c(
x = "Negative values of {.field .eval_time} are not allowed.",
i = "The following negative values were found: {.val {offenders}}."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice 👍

R/validation.R Outdated
Comment on lines 252 to 262
any_not_in_order <- any(
vapply(all_eval_times_list, function(x) is.unsorted(x), logical(1))
)
if (any_not_in_order) {
cli::cli_abort(
c(
x = "Values of {.field .eval_time} must be in increasing order."
),
call = call
)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In tune, we explicitly make use of the order of time points in eval_time: for Bayesian optimization and the tuning functions in finetune (racing etc), we can only optimize for one time point so the functions use the first one. If you want to optimize for 10 but also want the metrics to be calculated for 5, you would set eval_time = c(10, 5). Is the order a requirement from yardstick? If so, we need to decide where the re-ordering should happen. If not, we can just remove this check.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets remove the check + add tests to make sure that out of order eval_time works in {yardstick}

@EmilHvitfeldt EmilHvitfeldt merged commit ce9da1d into main Dec 14, 2023
12 checks passed
@EmilHvitfeldt EmilHvitfeldt deleted the fix462 branch December 14, 2023 19:03
Copy link

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 29, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dealing with improper survival input
2 participants