- Set up EvalPlus: HumanEval(+) and MBPP(+), along with their evaluation utilities, should already be installed via environment.yml.
- Set up CRUXEval:
git clone https://github.com/facebookresearch/cruxeval.git
Then set $CRUXEVAL_HOME in this script to the absolute path of the cloned repository.
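One way to obtain the absolute path needed for `$CRUXEVAL_HOME` is sketched here. `abs_dir` is a hypothetical helper, not part of CRUXEval or SemCoder, and the sketch assumes the clone sits in `./cruxeval` relative to the directory you run it from.

```shell
# Hypothetical helper: print the absolute, symlink-resolved path of a directory.
# Returns non-zero if the directory does not exist.
abs_dir() {
  (cd "$1" 2>/dev/null && pwd -P)
}

# Assumption: the repository was cloned into ./cruxeval.
if CRUXEVAL_HOME="$(abs_dir cruxeval)"; then
  export CRUXEVAL_HOME
  echo "CRUXEVAL_HOME=$CRUXEVAL_HOME"
else
  echo "cruxeval/ not found here; run git clone first" >&2
fi
```

The printed path is the value to copy into the script. Exporting the variable in your shell may not be enough if the script assigns `CRUXEVAL_HOME` itself.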
- Set up LiveCodeBench:
# Clone LiveCodeBench
git clone https://github.com/Robin-Y-Ding/LiveCodeBench.git; # forked version with SemCoder customization
# Set up environment
cd LiveCodeBench;
conda create -n livecodebench python=3.10;
conda activate livecodebench;
pip install poetry;
poetry install --with with-gpu;
- To evaluate SemCoder on EvalPlus, run:
cd SemCoder;
conda activate semcoder;
# make sure you are under <path>/SemCoder/
export PYTHONPATH="$(pwd)";
bash scripts/eval/eval_evalplus.sh
- To evaluate SemCoder on LiveCodeBench (code generation), run:
cd LiveCodeBench;
conda activate livecodebench;
# make sure you are under <path>/LiveCodeBench/
bash scripts/eval/eval_codegen.sh
- To evaluate SemCoder on CRUXEval, first make sure the official release is cloned and $CRUXEVAL_HOME is set as described in the setup steps, then run:
cd SemCoder;
conda activate semcoder;
# make sure you are under <path>/SemCoder/
export PYTHONPATH="$(pwd)";
bash scripts/eval/eval_cruxeval.sh
- To evaluate SemCoder on LiveCodeBench (code execution), run:
cd LiveCodeBench;
conda activate livecodebench;
# make sure you are under <path>/LiveCodeBench/
bash scripts/eval/eval_codeexe.sh
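
Each evaluation above depends on two things being right: the active conda environment and the current directory. A minimal pre-flight check might look like the following sketch. `check_eval_ready` is a hypothetical helper, not something shipped with SemCoder or LiveCodeBench; it relies only on `CONDA_DEFAULT_ENV`, which `conda activate` sets to the name of the active environment.

```shell
# Hypothetical helper: check_eval_ready <expected_conda_env> <script_path>
# Returns 0 when the expected conda environment is active and the script exists.
check_eval_ready() {
  expected_env="$1"
  script="$2"

  # CONDA_DEFAULT_ENV holds the name of the currently activated environment.
  if [ "${CONDA_DEFAULT_ENV:-}" != "$expected_env" ]; then
    echo "expected conda env '$expected_env', found '${CONDA_DEFAULT_ENV:-none}'" >&2
    return 1
  fi

  # A missing script usually means the current directory is not the repository root.
  if [ ! -f "$script" ]; then
    echo "missing $script; are you in the repository root?" >&2
    return 1
  fi

  return 0
}

# Example usage (run from <path>/SemCoder/):
#   check_eval_ready semcoder scripts/eval/eval_evalplus.sh \
#     && bash scripts/eval/eval_evalplus.sh
```

The same call works for the LiveCodeBench runs by passing `livecodebench` and the corresponding script path.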