[BUG] Ranking Evaluation Metrics Exceed 1 with "by_threshold" Relevancy Method #2154

mnhqut opened this issue Aug 23, 2024 · 4 comments
mnhqut commented Aug 23, 2024

Description

Hello!

I encountered an issue while evaluating the BPR (Bayesian Personalized Ranking) model using essentially the same code as the example notebook, but on a different dataset. Specifically, when using the "by_threshold" relevancy method with the ranking metrics, the computed values for precision@k, ndcg@k, and map@k exceed 1, which seems incorrect. The issue does not occur when switching the relevancy method to "top_k".

How do we replicate the issue?

I use the following parameters for BPR (all with the default seed):

import cornac

bpr = cornac.models.BPR(
    k=200,
    max_iter=100,
    learning_rate=0.01,
    lambda_reg=0.001,
    verbose=True
)
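Since the evaluation follows the example notebook, all_predictions would presumably be generated roughly as below. This is only a sketch: the column names userID/itemID and the use of predict_ranking are assumptions on my part, not confirmed details.

from recommenders.models.cornac.cornac_utils import predict_ranking

# Build the Cornac train set from the train split (the split itself is described
# in the follow-up comment below) and fit the model.
train_set = cornac.data.Dataset.from_uir(train.itertuples(index=False))
bpr.fit(train_set)

# Score every (user, item) pair not seen during training; column names are assumed.
all_predictions = predict_ranking(
    bpr, train, usercol="userID", itemcol="itemID", remove_seen=True
)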

Using this evaluation code:

from recommenders.evaluation.python_evaluation import map_at_k, ndcg_at_k, precision_at_k

TOP_K = 10
threshold = 50
eval_map = map_at_k(test, all_predictions, col_prediction="prediction",
                    relevancy_method='by_threshold', threshold=threshold, k=TOP_K)
eval_ndcg = ndcg_at_k(test, all_predictions, col_prediction="prediction",
                      relevancy_method='by_threshold', threshold=threshold, k=TOP_K)
eval_precision = precision_at_k(
    test, all_predictions, col_prediction="prediction",
    relevancy_method='by_threshold', threshold=threshold, k=TOP_K)

Here is the dataset I test on: https://github.com/mnhqut/rec_sys-dataset/blob/main/data.csv

My results:
MAP: 1.417529
NDCG: 1.359902
Precision@K: 2.256466

Willingness to contribute

  • [ ] Yes, I can contribute to this issue independently.
  • [x] Yes, I can contribute to this issue with guidance from the Recommenders community.
  • [ ] No, I cannot contribute at this time.
@mnhqut mnhqut added the bug Something isn't working label Aug 23, 2024

mnhqut commented Aug 24, 2024

I forgot to mention that the way I split the training and testing data was:

from recommenders.datasets.python_splitters import python_stratified_split

train, test = python_stratified_split(df, ratio=0.75)


miguelgfierro commented Aug 27, 2024

We need to review this @SimonYansenZhao @anargyri @daviddavo @loomlike and even @yueguoguo

@daviddavo

How do you get all_predictions? Can you provide the full code to reproduce the issue, or are you just using the deep dive notebook?

@daviddavo

The problem seems to be that when you use by_threshold, the k in some terms of the equations still remains the top_k value.

For example, in precision@k:

return (df_hit_count["hit"] / k).sum() / n_users

It divides by 10 (the k value, which is also the default), instead of by 50 (the threshold specified for by_threshold).

The other metrics have similar problems.
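To make that concrete, here is a small numeric illustration (made-up numbers, not library code) of how the value ends up above 1:

# Hypothetical numbers for a single user: with by_threshold the hit set is built
# from the user's top-threshold predictions, but the denominator stays at k.
k = 10
threshold = 50
hits_for_user = 25                       # hits found among the top-50 recommended items
precision_for_user = hits_for_user / k   # 25 / 10 = 2.5, already greater than 1
print(precision_for_user)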

Perhaps this is what by_threshold is intended to do. Is it a way of changing how many items are taken, even though you are still calculating with k? I don't really understand how by_threshold is supposed to work, so I can't tell whether this is a bug or intended behaviour.

I can solve the bug by just using threshold instead of k when necessary, but then by_threshold and top_k would be exactly the same.
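For reference, one possible shape of that change (just a sketch of the idea, not the actual library code) would be to let the denominator follow the relevancy method that built the hit set:

# Sketch only: df_hit_count is assumed to be the per-user hit-count DataFrame
# used in the snippet above. With by_threshold, dividing by threshold keeps the
# metric in [0, 1], but then it behaves exactly like top_k with k = threshold.
def precision_sketch(df_hit_count, n_users, k, threshold, relevancy_method):
    denominator = k if relevancy_method == "top_k" else threshold
    return (df_hit_count["hit"] / denominator).sum() / n_users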

Btw, here is a notebook that replicates the issue in Google Colab
