Clarification on how metrics are calculated #19

Open
elephaint opened this issue Mar 3, 2025 · 3 comments
@elephaint

If I run the following:

import pandas as pd
df_timesfm = pd.read_csv("results/timesfm_2_0_500m/all_results.csv")
print(f"TimesFM MASE: {df_timesfm['eval_metrics/MASE[0.5]'].mean():.2f} \n"
      f"TimesFM CRPS: {df_timesfm['eval_metrics/mean_weighted_sum_quantile_loss'].mean():.2f}")

the output is:

TimesFM MASE: 1.71 
TimesFM CRPS: 0.25

whereas the leaderboard here states:

[Image: screenshot of the GIFT-Eval leaderboard]

Can you explain / detail how the leaderboard is calculated or point me to where it is explained?

@cuthalionn
Contributor

cuthalionn commented Mar 3, 2025

Hi @elephaint,

The results for each model are standardized by the Seasonal Naive results, and then we take the geometric mean across datasets for each model. You can find the details in [the source code for the leaderboard](https://huggingface.co/spaces/Salesforce/GIFT-Eval/tree/main/src).
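For anyone wanting to reproduce the leaderboard numbers locally, here is a minimal sketch of that aggregation. It assumes a per-dataset Seasonal Naive results file at results/seasonal_naive/all_results.csv and a "dataset" column as the join key; both the path and the column name are assumptions, and the leaderboard source linked above has the authoritative logic.

import numpy as np
import pandas as pd

metric = "eval_metrics/MASE[0.5]"  # same column as in the snippet above

# The Seasonal Naive path and the "dataset" join key are assumptions --
# check the leaderboard source for the exact file layout and column names.
df_model = pd.read_csv("results/timesfm_2_0_500m/all_results.csv")
df_naive = pd.read_csv("results/seasonal_naive/all_results.csv")

merged = df_model.merge(df_naive, on="dataset", suffixes=("_model", "_naive"))

# Standardize each dataset's score by Seasonal Naive, then take the geometric
# mean across datasets (Seasonal Naive itself therefore lands at 1.0).
relative = merged[f"{metric}_model"] / merged[f"{metric}_naive"]
geo_mean = np.exp(np.log(relative).mean())
print(f"Standardized geometric-mean MASE: {geo_mean:.2f}")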

@elephaint
Author

Ah, thanks, I completely overlooked that. Scrolling down further it becomes obvious once you see SeasonalNaive at 1.0. Thanks!

Tiny thing: I noticed that in your Naive notebook you actually use SeasonalNaive. So I assume that what the notebook in this repo calls naive is actually SeasonalNaive on the leaderboard? (The notebook produces a results table named naive, but the results are SeasonalNaive as far as I can tell.)

@cuthalionn
Contributor

No worries, I am glad that clears up the confusion!

In the notebook we actually use Naive because the predictor is set to NaivePredictor; we use Seasonal Naive as the fallback model. But the same notebook can easily be adapted for Seasonal Naive too. One would just need to create a SeasonalNaivePredictor(StatsForecastPredictor) predictor class and set the model type accordingly (a rough sketch is below).
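A rough sketch of that adaptation, assuming the StatsForecastPredictor wrapper defined in the notebook selects its statsforecast model via a class-level attribute (the attribute name below is an assumption; mirror whatever NaivePredictor actually does):

from statsforecast.models import SeasonalNaive

# Hypothetical sketch: StatsForecastPredictor is the wrapper class defined in
# the notebook, and `ModelType` is an assumed attribute name -- copy the
# mechanism NaivePredictor uses to pick its statsforecast model.
class SeasonalNaivePredictor(StatsForecastPredictor):
    ModelType = SeasonalNaive

Note that SeasonalNaive takes a season_length argument, so the wrapper would also need to pass each dataset's seasonality through when instantiating the model.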

So the terms Naive and Seasonal Naive on the leaderboard and in the repository represent their respective models.
