Official code release of our ICLR 2025 paper.
# clone the repository
git clone https://github.com/RishiHazra/Revolve.git
cd Revolve
conda create -n "revolve" python=3.10
conda activate revolve
pip install -e .
export ROOT_PATH='Revolve'
export OPENAI_API_KEY='<your openai key>'
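# evolution.num_generations: number of generations
# evolution.individuals_per_generation: number of individuals in each generation
# database.num_islands: number of groups/populations to start with
# database.max_island_size: max number of samples in each group/population
# data_paths.run: run_id
# environment.name: choose between "HumanoidEnv" and "AdroitHandDoorEnv"
# (comments are kept off the continued lines: text after a trailing backslash breaks the command)
python main.py \
    evolution.num_generations=5 \
    evolution.individuals_per_generation=15 \
    database.num_islands=5 \
    database.max_island_size=8 \
    data_paths.run=10 \
    environment.name="HumanoidEnv"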
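The dotted `key=value` arguments above read like hierarchical config overrides, in the style of tools such as Hydra. Whether `main.py` actually uses Hydra is an assumption here. As a rough sketch of how such overrides map onto a nested config, the hypothetical helper `parse_overrides` below is illustrative only and is not part of the repository:

```python
from typing import Any


def _coerce(raw: str) -> Any:
    """Turn a raw string into an int or float when possible, else strip quotes."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    return raw.strip("\"'")


def parse_overrides(args: list[str]) -> dict[str, Any]:
    """Build a nested dict from dotted key=value overrides (hypothetical helper)."""
    config: dict[str, Any] = {}
    for arg in args:
        dotted_key, _, raw_value = arg.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            # Create intermediate sections such as "evolution" on demand.
            node = node.setdefault(name, {})
        node[leaf] = _coerce(raw_value)
    return config


# The same overrides as in the command above.
overrides = [
    "evolution.num_generations=5",
    "evolution.individuals_per_generation=15",
    "database.num_islands=5",
    "database.max_island_size=8",
    "data_paths.run=10",
    "environment.name=HumanoidEnv",
]
config = parse_overrides(overrides)
print(config["evolution"])
# {'num_generations': 5, 'individuals_per_generation': 15}
```

Each dotted prefix becomes a section of the config, so `database.num_islands=5` and `database.max_island_size=8` end up side by side under `database`.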
Note: we will soon release the AirSim environment setup script.
For AirSim, follow the instructions at https://microsoft.github.io/AirSim/build_linux/ and then export the following paths:
export AIRSIM_PATH='AirSim'
export AIRSIMNH_PATH='AirSimNH/AirSimNH/LinuxNoEditor/AirSimNH.sh'
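Before launching an AirSim run, it can help to confirm that the exported variables are visible to Python. The following is a minimal sketch, assuming all four variables from this README are needed for AirSim experiments; the helper `missing_env_vars` is hypothetical and not part of the repository:

```python
import os
from collections.abc import Mapping

# Variable names taken from the export commands in this README.
REQUIRED_VARS = ("ROOT_PATH", "OPENAI_API_KEY", "AIRSIM_PATH", "AIRSIMNH_PATH")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

For the non-AirSim environments, only `ROOT_PATH` and `OPENAI_API_KEY` appear in the setup commands, so `REQUIRED_VARS` could be trimmed accordingly.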
- The prompts are listed in the `prompts` folder.
- Elo scoring is in the `human_feedback` folder.
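For readers unfamiliar with Elo scoring, the sketch below shows the standard Elo update for a single pairwise comparison. It is the textbook formula, not necessarily what the code in the human_feedback folder does; the function names, the K-factor of 32, and the starting rating of 1500 are assumptions chosen for illustration:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(
    rating_a: float, rating_b: float, score_a: float, k: float = 32.0
) -> tuple[float, float]:
    """Update both ratings after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two equally rated candidates; the human prefers A.
print(update_elo(1500.0, 1500.0, score_a=1.0))
# (1516.0, 1484.0)
```

The update is zero-sum: whatever rating A gains, B loses, so the average rating across all compared candidates stays constant.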
@misc{hazra2024revolverewardevolutionlarge,
title={REvolve: Reward Evolution with Large Language Models using Human Feedback},
author={Rishi Hazra and Alkis Sygkounas and Andreas Persson and Amy Loutfi and Pedro Zuidberg Dos Martires},
year={2024},
eprint={2406.01309},
archivePrefix={arXiv},
primaryClass={cs.NE},
url={https://arxiv.org/abs/2406.01309},
}