Feature request: combining/crossing multiple resampling methods #455

Open
PathosEthosLogos opened this issue Sep 13, 2023 · 1 comment
Labels
feature a feature request or enhancement

Comments

@PathosEthosLogos

Here are a few scenarios I would like to describe in this feature request:

  1. Combining/crossing multiple resampling methods (e.g. spatial_nndm_cv() crossed with sliding_period() resamples)

  2. Determining/benchmarking the best model, similar to rank_results(), across different resampling methods (e.g. group_vfold_cv(cities) vs. spatial_clustering_cv(coords = (x, y)))

  3. Ability to pass a list of multiple resampling methods alongside model specifications, e.g. workflow_set(resamples = list(*)), similar to passing workflow_set(models = list(*), preproc = list(*)) (or the ability to pass a list of resamples, akin to workflow_map(resamples = list(*)))

Here are some example questions: would spatial_clustering_cv(coords = (x, y)) or group_vfold_cv(cities) be better for model tuning? How much does granularity matter in model tuning, e.g. group_vfold_cv(cities) vs. group_vfold_cv(municipalities)? How much is the model affected by resampling efficiency (e.g. compared with bootstraps)?

An example decision point: if the difference in accuracy loss between cities and province is small but the difference in run time (or electricity cost) is large, then province would be selected. The results could be shown in the output of rank_results(), with a column showing which resampling method was used.
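To make the request concrete, here is a minimal sketch of a manual workaround, not an existing workflowsets feature. It assumes a named list of rset objects, uses mtcars as a stand-in dataset, and uses cyl as a placeholder grouping variable. The names `resample_list`, `wf`, `results`, and `metrics` are hypothetical. The same workflow is fitted once per resampling scheme, and the list names become a `resampling` column in the stacked metrics.

```r
library(tidymodels)  # attaches rsample, parsnip, workflows, tune, dplyr, purrr

set.seed(455)

# Candidate resampling schemes for the same data (mtcars is a stand-in;
# cyl plays the role of a grouping variable such as cities).
resample_list <- list(
  vfold      = vfold_cv(mtcars, v = 5),
  bootstraps = bootstraps(mtcars, times = 10),
  group_cyl  = group_vfold_cv(mtcars, group = cyl)
)

# One simple workflow, reused for every resampling scheme.
wf <- workflow() |>
  add_formula(mpg ~ wt + hp) |>
  add_model(linear_reg())

# Fit the workflow under each scheme.
results <- map(resample_list, function(rs) fit_resamples(wf, resamples = rs))

# Stack the summarised metrics; the list names become the `resampling` column.
metrics <- results |>
  map(collect_metrics) |>
  bind_rows(.id = "resampling")

# Compare schemes on RMSE, lowest first.
metrics |>
  filter(.metric == "rmse") |>
  arrange(mean)
```

Wrapping each fit_resamples() call in system.time() would add the run-time side of the comparison. The request above is for workflowsets to handle this bookkeeping directly.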

Then, for utility, it would be nice to save the model together with its chosen resampling method so that the model specification can be reused right away (would this be tidypredict_fit()? augment()? bake()?).

@PathosEthosLogos PathosEthosLogos changed the title Combining/determining best resampling method akin to workflowsets (e.g. spatialresample + sliding_period) Feature request: combining/crossing resampling methods Sep 13, 2023
@PathosEthosLogos PathosEthosLogos changed the title Feature request: combining/crossing resampling methods Feature request: combining/crossing multiple resampling methods Sep 13, 2023
@hfrick
Member

hfrick commented Nov 1, 2023

Thanks for the issue! I'll have to think about this one a little.

@hfrick hfrick added the feature a feature request or enhancement label Nov 1, 2023