Discrimination Threshold plot explanation #1324

Open
robmcd opened this issue Feb 27, 2025 · 0 comments
robmcd commented Feb 27, 2025

Describe the issue
Hi, I have been trying to use the discrimination threshold plot as shown below:

from yellowbrick.classifier import PrecisionRecallCurve, DiscriminationThreshold

# Precision-Recall Plot
pr_curve = PrecisionRecallCurve(model)
pr_curve.fit(X_train, y_train)
pr_curve.score(X_test, y_test)
pr_curve.show()

# Threshold Plot
threshold_plot = DiscriminationThreshold(model)
threshold_plot.fit(X_train, y_train)
threshold_plot.score(X_test, y_test)
threshold_plot.show()
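
(For context, model and the train/test splits come from an ordinary scikit-learn workflow. The estimator and dataset below are placeholders standing in for my real data, just to make the snippet above self-contained.)

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder binary classification data standing in for my real dataset
X, y = make_classification(n_samples=1000, n_classes=2, weights=[0.8, 0.2], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Placeholder estimator; my actual model is fitted the same way
model = LogisticRegression(max_iter=1000)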

However, I'm confused by the results. Perhaps there's a gap in my understanding.

My precision-recall plot shows precision values ranging from 0.2 to 0.6, with a "jump" to 1 at zero recall.

My discrimination threshold plot shows "scores" for precision ranging from 0.5 up to 1 (talking about the line, not the band). The bands are very narrow and only widen from a threshold of 0.9 upwards.

On the precision-recall chart, a recall of 0.2 corresponds to a precision of ~0.4. However, on the discrimination threshold chart, a recall of 0.2 corresponds to a precision score of ~0.9?

Why do the precision vs. recall figures I see on the PR chart not match the precision and recall scores on the discrimination threshold chart?

Apologies for my ignorance. Could you help me understand?

Also, I don't understand how the "score" for precision can start at 0.5 (for a discrimination threshold value of 0), given that the precision-recall curve showed precision values as low as 0.2 (presumably at some low probability threshold). Is the varying discrimination threshold not the same as the varying probability cutoff used to generate the precision-recall curve? Why does a precision of 0.2 appear nowhere on the discrimination threshold plot?
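
To illustrate what I expected, here is a rough sketch of my mental model: precision and recall computed by hand on the test set at one explicit cutoff, using scikit-learn directly (variable names follow the snippet above; this is just how I assumed the two plots relate, not how I assume Yellowbrick works internally).

from sklearn.metrics import precision_recall_curve

# Positive-class probabilities on the held-out test set
y_scores = model.fit(X_train, y_train).predict_proba(X_test)[:, 1]

# Precision/recall at every candidate threshold, as I understood the PR curve
precision, recall, thresholds = precision_recall_curve(y_test, y_scores)

# Precision and recall if I apply one specific discrimination threshold, e.g. 0.5
threshold = 0.5
y_pred = (y_scores >= threshold).astype(int)
tp = ((y_pred == 1) & (y_test == 1)).sum()
fp = ((y_pred == 1) & (y_test == 0)).sum()
fn = ((y_pred == 0) & (y_test == 1)).sum()
print("precision @ 0.5:", tp / (tp + fp))
print("recall    @ 0.5:", tp / (tp + fn))

I expected the precision and recall at a given threshold here to line up with both charts, which is why the divergence confuses me.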

@DistrictDataLabs/team-oz-maintainers
