different errors with augmenting survival models #209

Closed
topepo opened this issue Nov 9, 2023 · 5 comments
Labels
bug an unexpected problem or unintended behavior

Comments

@topepo
Member

topepo commented Nov 9, 2023

I'm looking at unit tests that call augment() on workflows for survival models and check that it fails properly when eval_time is unspecified (see #200). We should get this error:

! The eval_time argument is missing, with no default.

When using the glmnet model, a glmnet-related error is triggered instead of the one for an improper argument call. This doesn't happen when just parsnip is used, so it is most likely a workflows issue.

library(tidymodels)
library(censored)
#> Loading required package: survival

set.seed(1)
sim_dat <- prodlim::SimSurv(500) %>%
  mutate(event_time = Surv(time, event)) %>%
  select(event_time, X1, X2)

workflow() %>%
  add_model(proportional_hazards()) %>%
  add_formula(event_time ~ .) %>%
  fit(data = sim_dat) %>% 
  augment(new_data = sim_dat)
#> Error in `augment()`:
#> ! The `eval_time` argument is missing, with no default.
#> Backtrace:
#>     ▆
#>  1. ├─... %>% augment(new_data = sim_dat)
#>  2. ├─generics::augment(., new_data = sim_dat)
#>  3. └─workflows:::augment.workflow(., new_data = sim_dat)
#>  4.   ├─generics::augment(...)
#>  5.   └─parsnip:::augment.model_fit(fit, new_data_forged, eval_time = eval_time, ...)
#>  6.     └─parsnip:::augment_censored(x, new_data, eval_time = eval_time)
#>  7.       └─rlang::abort(...)

workflow() %>%
  add_model(proportional_hazards(penalty = 0.001) %>% set_engine("glmnet")) %>%
  add_formula(event_time ~ .) %>%
  fit(data = sim_dat) %>% 
  augment(new_data = sim_dat)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting 
#> a method for function 'as.matrix': Cholmod error 'X and/or Y have wrong 
#> dimensions' at file ../MatrixOps/cholmod_sdmult.c, line 88
# With parsnip alone, both engines give the same (expected) error

proportional_hazards() %>%
  fit(event_time ~ ., data = sim_dat) %>% 
  augment(new_data = sim_dat)
#> Error in `augment()`:
#> ! The `eval_time` argument is missing, with no default.
#> Backtrace:
#>     ▆
#>  1. ├─... %>% augment(new_data = sim_dat)
#>  2. ├─generics::augment(., new_data = sim_dat)
#>  3. └─parsnip:::augment.model_fit(., new_data = sim_dat)
#>  4.   └─parsnip:::augment_censored(x, new_data, eval_time = eval_time)
#>  5.     └─rlang::abort(...)

proportional_hazards(penalty = 0.001) %>% 
  set_engine("glmnet") %>% 
  fit(event_time ~ ., data = sim_dat) %>% 
  augment(new_data = sim_dat)
#> Error in `augment()`:
#> ! The `eval_time` argument is missing, with no default.
#> Backtrace:
#>     ▆
#>  1. ├─... %>% augment(new_data = sim_dat)
#>  2. ├─generics::augment(., new_data = sim_dat)
#>  3. └─parsnip:::augment.model_fit(., new_data = sim_dat)
#>  4.   └─parsnip:::augment_censored(x, new_data, eval_time = eval_time)
#>  5.     └─rlang::abort(...)

Created on 2023-11-09 with reprex v2.0.2
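
For contrast with the failing calls above, here is a minimal sketch of the call with eval_time supplied. It assumes a version of workflows whose augment() method forwards eval_time to parsnip, as the backtrace above suggests; the time points c(1, 5) are arbitrary illustrative values, and it uses the default survival engine rather than glmnet.

```r
library(tidymodels)
library(censored)

set.seed(1)
sim_dat <- prodlim::SimSurv(500) %>%
  mutate(event_time = Surv(time, event)) %>%
  select(event_time, X1, X2)

wf_fit <- workflow() %>%
  add_model(proportional_hazards()) %>%
  add_formula(event_time ~ .) %>%
  fit(data = sim_dat)

# eval_time values are assumptions chosen only for illustration
res <- augment(wf_fit, new_data = sim_dat, eval_time = c(1, 5))
```

If this runs in your setup, res should have one row per row of sim_dat, with prediction columns added alongside the original ones.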

@topepo topepo added the bug an unexpected problem or unintended behavior label Nov 9, 2023
@topepo
Member Author

topepo commented Nov 9, 2023

The error occurs in parsnip:::augment_censored(). That calls

predict(x, new_data = new_data, type = "time") # 'x' is a parsnip model

and that fails when new_data is created by the workflow. For example, using the test data, the workflow's new_data looks like:

Browse[2]> new_data
# A tibble: 500 × 4
   `(Intercept)`    X1     X2 event_time
           <dbl> <dbl>  <dbl>     <Surv>
 1             1     1 -0.626  3.072882 
 2             1     1  0.184  1.010515 
 3             1     0 -0.836  5.739025 
 4             1     1  1.60   2.483024 
 5             1     0  0.330 10.896000 
 6             1     0 -0.820  9.783280+
 7             1     1  0.487  3.489154 
 8             1     1  0.738  6.022507+
 9             1     1  0.576  3.636429 
10             1     1 -0.305  6.119918 
# ℹ 490 more rows

Without a workflow, it is:

Browse[2]> new_data
# A tibble: 500 × 3
      X1     X2 event_time
   <dbl>  <dbl>     <Surv>
 1     1 -0.626  3.072882 
 2     1  0.184  1.010515 
 3     0 -0.836  5.739025 
 4     1  1.60   2.483024 
 5     0  0.330 10.896000 
 6     0 -0.820  9.783280+
 7     1  0.487  3.489154 
 8     1  0.738  6.022507+
 9     1  0.576  3.636429 
10     1 -0.305  6.119918 
# ℹ 490 more rows

Perhaps the addition of the intercept column causes the failure (although the error message does not suggest that).
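
To illustrate why an extra intercept column could plausibly produce a dimension error, here is a base R sketch. It is not the glmnet code path, and the toy data and coefficients are made up for illustration; it only shows that a design matrix with an added (Intercept) column no longer conforms to coefficients estimated on the two predictors alone.

```r
# Toy predictors standing in for X1 and X2 (illustrative values only)
toy <- data.frame(X1 = c(1, 1, 0), X2 = c(-0.626, 0.184, -0.836))

# Design matrices without and with an intercept column
x_no_int <- model.matrix(~ X1 + X2 - 1, data = toy)
x_with_int <- model.matrix(~ X1 + X2, data = toy)

# Hypothetical coefficients for a model fitted on the two predictors only
beta <- c(X1 = 0.5, X2 = -0.25)

# Two columns times two coefficients: conformable
ok <- x_no_int %*% beta

# Three columns times two coefficients: the product fails, so capture the message
bad <- tryCatch(
  x_with_int %*% beta,
  error = function(e) conditionMessage(e)
)

colnames(x_with_int)
bad
```

Whether this is what actually happens inside the glmnet prediction code is an assumption that would need to be checked in the debugger.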

@hfrick
Member

hfrick commented Nov 13, 2023

@topepo which versions did you use here? I can't reproduce with the current dev versions of parsnip, censored, and workflows.

library(tidymodels)
library(censored)
#> Loading required package: survival

set.seed(1)
sim_dat <- prodlim::SimSurv(500) %>%
  mutate(event_time = Surv(time, event)) %>%
  select(event_time, X1, X2)

workflow() %>%
  add_model(proportional_hazards()) %>%
  add_formula(event_time ~ .) %>%
  fit(data = sim_dat) %>% 
  augment(new_data = sim_dat)
#> Error in `augment()`:
#> ! The `eval_time` argument is missing, with no default.
#> Backtrace:
#>     ▆
#>  1. ├─... %>% augment(new_data = sim_dat)
#>  2. ├─generics::augment(., new_data = sim_dat)
#>  3. └─workflows:::augment.workflow(., new_data = sim_dat)
#>  4.   ├─generics::augment(...)
#>  5.   └─parsnip:::augment.model_fit(fit, new_data_forged, eval_time = eval_time, ...)
#>  6.     └─parsnip:::augment_censored(x, new_data, eval_time = eval_time)
#>  7.       └─rlang::abort(...)

workflow() %>%
  add_model(proportional_hazards(penalty = 0.001) %>% set_engine("glmnet")) %>%
  add_formula(event_time ~ .) %>%
  fit(data = sim_dat) %>% 
  augment(new_data = sim_dat)
#> Error in `augment()`:
#> ! The `eval_time` argument is missing, with no default.
#> Backtrace:
#>     ▆
#>  1. ├─... %>% augment(new_data = sim_dat)
#>  2. ├─generics::augment(., new_data = sim_dat)
#>  3. └─workflows:::augment.workflow(., new_data = sim_dat)
#>  4.   ├─generics::augment(...)
#>  5.   └─parsnip:::augment.model_fit(fit, new_data_forged, eval_time = eval_time, ...)
#>  6.     └─parsnip:::augment_censored(x, new_data, eval_time = eval_time)
#>  7.       └─rlang::abort(...)

Created on 2023-11-13 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.1 (2023-06-16)
#>  os       macOS Sonoma 14.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/London
#>  date     2023-11-13
#>  pandoc   3.1.1 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  backports      1.4.1      2021-12-13 [1] CRAN (R 4.3.0)
#>  broom        * 1.0.5      2023-06-09 [1] CRAN (R 4.3.0)
#>  censored     * 0.2.0.9000 2023-07-18 [1] Github (tidymodels/censored@f9eccb6)
#>  class          7.3-22     2023-05-03 [2] CRAN (R 4.3.1)
#>  cli            3.6.1.9000 2023-09-26 [1] Github (r-lib/cli@641fe8c)
#>  codetools      0.2-19     2023-02-01 [2] CRAN (R 4.3.1)
#>  colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
#>  data.table     1.14.8     2023-02-17 [1] CRAN (R 4.3.1)
#>  dials        * 1.2.0      2023-04-03 [1] CRAN (R 4.3.1)
#>  DiceDesign     1.9        2021-02-13 [1] CRAN (R 4.3.0)
#>  digest         0.6.33     2023-07-07 [1] CRAN (R 4.3.1)
#>  dplyr        * 1.1.3      2023-09-03 [1] CRAN (R 4.3.0)
#>  evaluate       0.23       2023-11-01 [1] CRAN (R 4.3.1)
#>  fansi          1.0.5      2023-10-08 [1] CRAN (R 4.3.1)
#>  fastmap        1.1.1      2023-02-24 [1] CRAN (R 4.3.1)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.3.0)
#>  fs             1.6.3      2023-07-20 [1] CRAN (R 4.3.1)
#>  furrr          0.3.1      2022-08-15 [1] CRAN (R 4.3.1)
#>  future         1.33.0     2023-07-01 [1] CRAN (R 4.3.1)
#>  future.apply   1.11.0     2023-05-21 [1] CRAN (R 4.3.1)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
#>  ggplot2      * 3.4.4      2023-10-12 [1] CRAN (R 4.3.1)
#>  glmnet         4.1-8      2023-08-22 [1] CRAN (R 4.3.0)
#>  globals        0.16.2     2022-11-21 [1] CRAN (R 4.3.0)
#>  glue           1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
#>  gower          1.0.1      2022-12-22 [1] CRAN (R 4.3.1)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.3.0)
#>  gtable         0.3.4      2023-08-21 [1] CRAN (R 4.3.0)
#>  hardhat        1.3.0      2023-03-30 [1] CRAN (R 4.3.0)
#>  htmltools      0.5.7      2023-11-03 [1] CRAN (R 4.3.1)
#>  infer        * 1.0.5      2023-09-06 [1] CRAN (R 4.3.0)
#>  ipred          0.9-14     2023-03-09 [1] CRAN (R 4.3.1)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.3.0)
#>  knitr          1.45       2023-10-30 [1] CRAN (R 4.3.1)
#>  lattice        0.22-5     2023-10-24 [1] CRAN (R 4.3.1)
#>  lava           1.7.3      2023-11-04 [1] CRAN (R 4.3.1)
#>  lhs            1.1.6      2022-12-17 [1] CRAN (R 4.3.1)
#>  lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.3.1)
#>  listenv        0.9.0      2022-12-16 [1] CRAN (R 4.3.1)
#>  lubridate      1.9.3      2023-09-27 [1] CRAN (R 4.3.1)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
#>  MASS           7.3-60     2023-05-04 [2] CRAN (R 4.3.1)
#>  Matrix         1.6-1.1    2023-09-18 [1] CRAN (R 4.3.1)
#>  modeldata    * 1.2.0      2023-08-09 [1] CRAN (R 4.3.0)
#>  modelenv       0.1.1      2023-03-08 [1] CRAN (R 4.3.1)
#>  munsell        0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
#>  nnet           7.3-19     2023-05-03 [2] CRAN (R 4.3.1)
#>  parallelly     1.36.0     2023-05-26 [1] CRAN (R 4.3.1)
#>  parsnip      * 1.1.1.9001 2023-11-10 [1] Github (tidymodels/parsnip@86f8a4e)
#>  pillar         1.9.0.9003 2023-11-10 [1] Github (r-lib/pillar@92fdbba)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
#>  prodlim        2023.08.28 2023-08-28 [1] CRAN (R 4.3.0)
#>  purrr        * 1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
#>  R.cache        0.15.0     2021-04-30 [1] CRAN (R 4.3.0)
#>  R.methodsS3    1.8.1      2020-08-26 [1] CRAN (R 4.3.0)
#>  R.oo           1.24.0     2020-08-26 [1] CRAN (R 4.3.0)
#>  R.utils        2.11.0     2021-09-26 [1] CRAN (R 4.3.0)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
#>  Rcpp           1.0.11     2023-07-06 [1] CRAN (R 4.3.1)
#>  recipes      * 1.0.8.9000 2023-11-10 [1] Github (tidymodels/recipes@746b473)
#>  reprex         2.0.2      2022-08-17 [1] CRAN (R 4.3.0)
#>  rlang          1.1.2      2023-11-04 [1] CRAN (R 4.3.1)
#>  rmarkdown      2.25       2023-09-18 [1] CRAN (R 4.3.1)
#>  rpart          4.1.21     2023-10-09 [1] CRAN (R 4.3.1)
#>  rsample      * 1.2.0.9000 2023-11-01 [1] Github (tidymodels/rsample@be593b9)
#>  rstudioapi     0.15.0     2023-07-07 [1] CRAN (R 4.3.1)
#>  scales       * 1.2.1      2022-08-20 [1] CRAN (R 4.3.0)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
#>  shape          1.4.6      2021-05-19 [1] CRAN (R 4.3.1)
#>  styler         1.7.0      2022-03-13 [1] CRAN (R 4.3.0)
#>  survival     * 3.5-7      2023-08-14 [1] CRAN (R 4.3.0)
#>  tibble       * 3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
#>  tidymodels   * 1.1.1      2023-08-24 [1] CRAN (R 4.3.1)
#>  tidyr        * 1.3.0      2023-01-24 [1] CRAN (R 4.3.0)
#>  tidyselect     1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
#>  timechange     0.2.0      2023-01-11 [1] CRAN (R 4.3.1)
#>  timeDate       4022.108   2023-01-07 [1] CRAN (R 4.3.1)
#>  tune         * 1.1.2.9000 2023-11-01 [1] Github (tidymodels/tune@3f82cb2)
#>  utf8           1.2.4      2023-10-22 [1] CRAN (R 4.3.1)
#>  vctrs          0.6.4      2023-10-12 [1] CRAN (R 4.3.1)
#>  withr          2.5.2      2023-10-30 [1] CRAN (R 4.3.1)
#>  workflows    * 1.1.3.9000 2023-11-13 [1] Github (tidymodels/workflows@1413997)
#>  workflowsets * 1.0.1      2023-04-06 [1] CRAN (R 4.3.1)
#>  xfun           0.41       2023-11-01 [1] CRAN (R 4.3.1)
#>  yaml           2.3.7      2023-01-23 [1] CRAN (R 4.3.0)
#>  yardstick    * 1.2.0.9001 2023-11-01 [1] Github (tidymodels/yardstick@690e738)
#> 
#>  [1] /Users/hannah/Library/R/arm64/4.3/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

@hfrick
Member

hfrick commented Nov 13, 2023

Surfaced in tidymodels/extratests#120. The corresponding test in extratests should be checked/updated as part of closing this.

@hfrick
Member

hfrick commented Jan 16, 2024

I still can't reproduce the difference between the glmnet and survival engines, so I'm closing this now.

library(tidymodels)
library(censored)
#> Loading required package: survival

set.seed(1)
sim_dat <- prodlim::SimSurv(500) %>%
  mutate(event_time = Surv(time, event)) %>%
  select(event_time, X1, X2)

workflow() %>%
  add_model(proportional_hazards()) %>%
  add_formula(event_time ~ .) %>%
  fit(data = sim_dat) %>% 
  augment(new_data = sim_dat)
#> Error in `augment()`:
#> ! The `eval_time` argument is missing, with no default.

workflow() %>%
  add_model(proportional_hazards(penalty = 0.001) %>% 
              set_engine("glmnet")) %>%
  add_formula(event_time ~ .) %>%
  fit(data = sim_dat) %>% 
  augment(new_data = sim_dat)
#> Error in `augment()`:
#> ! The `eval_time` argument is missing, with no default.

Created on 2024-01-16 with reprex v2.0.2

@hfrick hfrick closed this as completed Jan 16, 2024

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

@github-actions github-actions bot locked and limited conversation to collaborators Jan 31, 2024