
Custom probability thresholds for binary classification? #64

Open
yangxg opened this issue Aug 6, 2020 · 2 comments
Labels: feature (a feature request or enhancement)


yangxg commented Aug 6, 2020

Dear Authors:

I am aware that there is a plan to add support for custom probability thresholds for classification in the workflows package, but it does not seem to be published yet. I also notice that the probably package is very helpful for finding an appropriate threshold value (the link), but I don't know how to integrate it into the workflow.

Since this feature is important to one of my projects, I am wondering whether there is an alternative approach to achieve it within the whole tidymodels workflow.
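The closest I have gotten by hand is to predict with the fitted workflow and then pass the probabilities to probably to scan candidate thresholds. A rough sketch of what I mean, where wf_fit, validation_data, outcome, and .pred_yes are all made-up names from my own setup:

library(tidymodels)
library(probably)

# class probabilities from the fitted workflow on held-out data
val_preds <- predict(wf_fit, validation_data, type = "prob") %>%
  bind_cols(validation_data %>% select(outcome))

# scan a grid of candidate thresholds; threshold_perf() reports
# sensitivity, specificity, and J-index at each one
val_preds %>%
  threshold_perf(truth = outcome, estimate = .pred_yes,
                 thresholds = seq(0.1, 0.9, by = 0.05))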

Anyway, I also look forward to the one-line solution in the future! Thanks again for the great work!

Xiaoguang

@juliasilge juliasilge added the feature a feature request or enhancement label Aug 28, 2020

Teett commented Feb 13, 2021

Hi!!

I was wondering if there is any update on this issue? This is an important feature when dealing with classification problems that require higher specificity (e.g., mortality prediction and health risks), and at the moment I feel like the only way to do it in tidymodels is something of this sort:

collect_predictions(final_model) %>%
  mutate(corrected_class = factor(
    if_else(.pred_alive > 0.75, "alive", "dead"),  # manual threshold
    levels = levels(vital_state)  # keep factor levels aligned with the truth column
  )) %>%
  conf_mat(truth = vital_state, estimate = corrected_class)

I think this workaround is accurate for a confusion matrix, but we still need to be able to calculate the other metrics at the new threshold, and I can't find a simple way to do that with yardstick.
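One thing that might close the gap (I have not fully tested this) is probably::make_two_class_pred(), which creates hard class predictions at an arbitrary threshold, and the resulting column can be fed to yardstick metrics. A sketch reusing the same column names as above:

library(probably)
library(yardstick)

cls_metrics <- metric_set(sens, spec, j_index)

collect_predictions(final_model) %>%
  mutate(corrected_class = make_two_class_pred(
    .pred_alive,
    levels = levels(vital_state),
    threshold = 0.75  # custom threshold instead of the default 0.5
  )) %>%
  cls_metrics(truth = vital_state, estimate = corrected_class)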

Thank you!!

@StevenWallaert

Hi,
Would there be any update on this?

I am aware of {probably} and how to look for an optimal threshold after training.
However, I would like to tune the threshold within a workflow and treat the threshold as any other hyperparameter.
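To make it concrete, this is roughly the manual version of what I mean, assuming out-of-sample predictions saved during resampling (wf, folds, outcome, and .pred_event are placeholder names):

library(tidymodels)
library(probably)

res <- fit_resamples(wf, resamples = folds,
                     control = control_resamples(save_pred = TRUE))

# evaluate each candidate threshold within every resample, then average
# across resamples and keep the threshold with the best J-index
collect_predictions(res) %>%
  group_by(id) %>%
  threshold_perf(truth = outcome, estimate = .pred_event,
                 thresholds = seq(0.05, 0.95, by = 0.05)) %>%
  group_by(.threshold, .metric) %>%
  summarise(mean_estimate = mean(.estimate), .groups = "drop") %>%
  filter(.metric == "j_index") %>%
  slice_max(mean_estimate, n = 1)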

Many thanks!
