OpenThaiGPT - Thai Exams Eval

Kobkrit Viriyayudhakorn

Usage: run python evaluate.py <model_name> [model_path/api_key] to evaluate the named model on all exams in the exams folder.
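As a rough illustration of the usage line above, the sketch below assembles the command for a single model. The helper name `build_eval_command` is hypothetical and not part of this repository. It only mirrors the usage string and assumes nothing about how evaluate.py handles its arguments internally.

```python
import shlex
import sys
from typing import List, Optional


def build_eval_command(model_name: str, path_or_key: Optional[str] = None) -> List[str]:
    """Assemble argv for `python evaluate.py <model_name> [model_path/api_key]`.

    Hypothetical helper for illustration only: it follows the usage line
    and does not inspect evaluate.py itself.
    """
    if not model_name:
        raise ValueError("model_name is required")
    # Use the current interpreter so the active Conda environment is respected.
    cmd = [sys.executable, "evaluate.py", model_name]
    # The second argument is optional: a local model path or an API key.
    if path_or_key:
        cmd.append(path_or_key)
    return cmd


if __name__ == "__main__":
    # Print the command for one of the listed benchmark models.
    print(shlex.join(build_eval_command("openthaigpt/openthaigpt-1.0.0-7b-chat")))
```

The returned list can be passed to `subprocess.run` from the repository root, where evaluate.py lives.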

Available benchmark models:

openthaigpt/openthaigpt-1.0.0-beta-7b-chat-ckpt-hf
openthaigpt/openthaigpt-1.0.0-beta-13b-chat-hf
openthaigpt/openthaigpt-1.0.0-7b-chat
openthaigpt/openthaigpt-1.0.0-13b-chat
openthaigpt/openthaigpt-1.0.0-70b-chat
sail/Sailor-7B-Chat
pythainlp/wangchanglm-7.5B-sft-enth
aisingapore/sea-lion-7b-instruct
SeaLLMs/SeaLLM-7B-v1
SeaLLMs/SeaLLM-7B-v2
claude-3-opus-20240229
claude-3-sonnet-20240229
claude-3-haiku-20240307
typhoon-instruct
gpt-3.5-turbo
gpt-4
gemini-pro-1.5

Available benchmark datasets:

A-Level
TGAT
TPAT1
Thai Investment Consultant
Facebook Belebele Thai
xcopa_th_200
xnli2.0_th_200
Thai ONET M3
Thai ONET M6

Exams Details

https://docs.google.com/spreadsheets/d/1ZtP5Jkx0IvCWNPQhMKitZszGnLKqvEDEf0OKdmQiXjA/edit#gid=1181424412

Creating a Conda environment

conda create --name otg-exam-eval python=3.11

Activating the Conda environment

conda activate otg-exam-eval

Installing the required packages

pip install -r requirements.txt

Run Evaluation

./run.sh
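The contents of run.sh are not shown here, so the following is only a sketch of how several models could be evaluated in sequence from Python. It is not a description of what run.sh does. The function name `evaluate_models` and the `runner` parameter are hypothetical. The only thing taken from this README is the command form `python evaluate.py <model_name> [model_path/api_key]`.

```python
import subprocess
import sys
from typing import Callable, Dict, Optional


def evaluate_models(
    models: Dict[str, Optional[str]],
    runner: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> Dict[str, int]:
    """Run evaluate.py once per model and collect the exit codes.

    `models` maps a model name to an optional model path or API key.
    `runner` is injectable (an assumption of this sketch) so the loop
    can be exercised without launching real evaluations.
    """
    results: Dict[str, int] = {}
    for name, path_or_key in models.items():
        cmd = [sys.executable, "evaluate.py", name]
        if path_or_key:
            cmd.append(path_or_key)
        # check=False: a failing model should not abort the remaining runs.
        completed = runner(cmd, check=False)
        results[name] = completed.returncode
    return results
```

A call such as `evaluate_models({"gpt-4": "<api_key>", "sail/Sailor-7B-Chat": None})`, made from the repository root, would return a mapping from model name to exit code. A nonzero code flags a run that needs attention.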
