
bug: mmlu evaluation scores are constant #1172

Open
jalling97 opened this issue Oct 1, 2024 · 0 comments
Labels
possible-bug 🐛 Something may not be working

Comments

@jalling97
Contributor

Steps to reproduce

  1. Deploy LFAI
  2. Run evaluations against the deployed instance
  3. View MMLU results

Expected result

  • Scores should vary across each topic category in MMLU

Actual Result

  • All scores are identical for every category (approximately 0.69, which is relatively high)

Visual Proof (screenshots, videos, text, etc.)

INFO:root:MMLU task scores:
                            Task    Score
0  high_school_european_history  0.69697
1               business_ethics  0.69697

Additional Context

May need to investigate the underlying implementation of MMLU in deepeval.
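
For reference, a minimal isolation sketch, assuming deepeval's MMLU benchmark API (MMLU, MMLUTask, task_scores) and a hypothetical DeepEvalBaseLLM wrapper named LFAIModel around the deployed LFAI instance (the lfai_eval.models import path is illustrative, not the real one). It runs only the two tasks from the log above, outside the full evaluation harness, so a per-task score difference, or the lack of one, is easy to spot:

# Standalone check of deepeval's MMLU scoring, bypassing the LFAI evaluation harness.
from deepeval.benchmarks import MMLU
from deepeval.benchmarks.tasks import MMLUTask

# Hypothetical: substitute whatever DeepEvalBaseLLM subclass the evaluation
# suite actually uses to talk to the deployed LFAI endpoint.
from lfai_eval.models import LFAIModel

model = LFAIModel()

# Only the two tasks shown in the log above, to keep the run short.
benchmark = MMLU(
    tasks=[
        MMLUTask.HIGH_SCHOOL_EUROPEAN_HISTORY,
        MMLUTask.BUSINESS_ETHICS,
    ],
    n_shots=5,
)

benchmark.evaluate(model=model)

# task_scores is a DataFrame with Task/Score columns, matching the log output
# in this issue. If both rows are still identical here, the constant score
# likely originates in deepeval's MMLU implementation or in how the model
# wrapper returns predictions, rather than in the LFAI harness itself.
print(benchmark.task_scores)
print("Overall:", benchmark.overall_score)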
