REvolve: Reward Evolution with Large Language Models using Human Feedback

Official code release of our ICLR 2025 paper.

Setup

# clone the repository 
git clone https://github.com/RishiHazra/Revolve.git
cd Revolve
conda create -n "revolve" python=3.10
conda activate revolve
pip install -e .

Run

export ROOT_PATH='Revolve'
export OPENAI_API_KEY='<your openai key>'
# evolution.num_generations: number of generations
# evolution.individuals_per_generation: number of individuals in each generation
# database.num_islands: number of groups/populations to start with
# database.max_island_size: max number of samples in each group/population
# data_paths.run: run_id
# environment.name: choose between "HumanoidEnv" or "AdroitHandDoorEnv"
python main.py \
        evolution.num_generations=5 \
        evolution.individuals_per_generation=15 \
        database.num_islands=5 \
        database.max_island_size=8 \
        data_paths.run=10 \
        environment.name="HumanoidEnv"
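
To launch both supported environments with the same settings, the command above can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_command`, `DEFAULT_OVERRIDES`, and `ENVIRONMENTS` are hypothetical names, and the override keys and values are copied from the example command. It only builds and prints the commands; it does not run them.

```python
import shlex

# Override values copied from the example command in this README.
DEFAULT_OVERRIDES = {
    "evolution.num_generations": 5,
    "evolution.individuals_per_generation": 15,
    "database.num_islands": 5,
    "database.max_island_size": 8,
}

# The two environment names listed in this README.
ENVIRONMENTS = ("HumanoidEnv", "AdroitHandDoorEnv")


def build_command(env_name, run_id, overrides=None):
    """Return the argv list for one run (hypothetical helper, not in the repo)."""
    if env_name not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env_name!r}")
    settings = dict(DEFAULT_OVERRIDES)
    settings.update(overrides or {})  # caller-supplied values win
    settings["data_paths.run"] = run_id
    settings["environment.name"] = env_name
    return ["python", "main.py"] + [f"{key}={value}" for key, value in settings.items()]


if __name__ == "__main__":
    # Give each environment its own run_id so outputs do not collide.
    for run_id, env_name in enumerate(ENVIRONMENTS, start=10):
        print(shlex.join(build_command(env_name, run_id)))
```

The printed lines can be pasted into a shell from the repository root once `ROOT_PATH` and `OPENAI_API_KEY` are exported.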

Note: we will soon release the AirSim environment setup script.

For AirSim, follow the instructions at https://microsoft.github.io/AirSim/build_linux/ and then set the following paths:

export AIRSIM_PATH='AirSim'
export AIRSIMNH_PATH='AirSimNH/AirSimNH/LinuxNoEditor/AirSimNH.sh'

Other Utilities

The prompts are listed in the prompts folder.
Elo scoring is in the human_feedback folder.
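
For readers unfamiliar with Elo scoring, here is a minimal sketch of the standard Elo update for one pairwise comparison. It is an illustration only and not necessarily what the code in human_feedback does: the function names, the K-factor of 32, and the 400-point scale are conventional defaults assumed here.

```python
def expected_score(rating_a, rating_b):
    """Probability that A is preferred over B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    k=32 is an assumed default, not a value taken from the repository.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


if __name__ == "__main__":
    # Two equally rated candidates; the human evaluator prefers the first.
    print(update_elo(1500.0, 1500.0, 1.0))
```

With this update rule the total rating of the two candidates is conserved, so ratings only shift between the compared pair.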

Citation

To cite our paper:

@misc{hazra2024revolverewardevolutionlarge,
      title={REvolve: Reward Evolution with Large Language Models using Human Feedback}, 
      author={Rishi Hazra and Alkis Sygkounas and Andreas Persson and Amy Loutfi and Pedro Zuidberg Dos Martires},
      year={2024},
      eprint={2406.01309},
      archivePrefix={arXiv},
      primaryClass={cs.NE},
      url={https://arxiv.org/abs/2406.01309}, 
}
