William Liang, Sam Wang, Hung-Ju Wang,
Osbert Bastani, Dinesh Jayaraman†, Yecheng Jason Ma†
University of Pennsylvania
Recent work has demonstrated that a promising strategy for teaching robots a wide range of complex skills is to train them on a curriculum of progressively more challenging environments. However, developing an effective curriculum of environment distributions currently requires significant expertise, which must be repeated for every new domain. Our key insight is that environments are often naturally represented as code. Thus, we probe whether effective environment curriculum design can be achieved and automated via code generation by large language models (LLMs). In this paper, we introduce Eurekaverse, an unsupervised environment design algorithm that uses LLMs to sample progressively more challenging, diverse, and learnable environments for skill training. We validate Eurekaverse's effectiveness in the domain of quadrupedal parkour learning, in which a quadruped robot must traverse a variety of obstacle courses. The automatic curriculum designed by Eurekaverse enables gradual learning of complex parkour skills in simulation and can successfully transfer to the real world, outperforming manual training courses designed by humans.
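For intuition on what it means for environments to be "represented as code," an obstacle course can be written as a short program that carves obstacles into a terrain heightfield, and an LLM can propose many such programs at varying difficulty. The snippet below is a purely hypothetical illustration of such generated code; the function name, signature, and heightfield convention are made up for this example and are not Eurekaverse's actual interface.

```python
import numpy as np

def set_hurdle_course(height_field: np.ndarray, difficulty: float,
                      resolution_m: float = 0.05, spacing_m: float = 1.5) -> np.ndarray:
    """Hypothetical LLM-written environment: evenly spaced hurdles whose
    height scales with the requested difficulty (0 = easiest, 1 = hardest)."""
    hurdle_height_m = 0.10 + 0.30 * difficulty            # taller hurdles at higher difficulty
    spacing_cells = int(spacing_m / resolution_m)          # hurdle spacing in heightfield cells
    height_cells = int(hurdle_height_m / resolution_m)     # hurdle height in heightfield units
    for x in range(spacing_cells, height_field.shape[0], spacing_cells):
        height_field[x:x + 2, :] += height_cells           # a thin wall spanning the course width
    return height_field
```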
The following instructions will install everything under one Conda environment. We have tested on Ubuntu 20.04.
- Create a new Conda environment with:
  ```bash
  conda create -n eurekaverse python=3.8
  conda activate eurekaverse
  ```
- Install IsaacGym:
  - Download and install IsaacGym from NVIDIA: https://developer.nvidia.com/isaac-gym.
  - Unzip the file:
    ```bash
    tar -xf IsaacGym_Preview_4_Package.tar.gz
    ```
  - Install the python package:
    ```bash
    cd isaacgym/python
    pip install -e .
    ```
- Install Eurekaverse:
  ```bash
  cd eurekaverse
  pip install -e .
  ```
- Install `legged_gym` and `rsl_rl`, the base code used for quadruped reinforcement learning in simulation, extended from Extreme Parkour:
  ```bash
  pip install -e extreme-parkour/legged_gym
  pip install -e extreme-parkour/rsl_rl
  ```
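With the packages above installed, a quick import smoke test can catch common setup issues (e.g., CUDA or library-path problems) before a full run. This is a minimal sketch and not part of the repository; it assumes the module names `isaacgym`, `legged_gym`, and `rsl_rl` exposed by the installs above.

```python
# smoke_test.py -- optional check that the simulation stack imports cleanly (not part of the repo)
# Note: IsaacGym must be imported before torch; the reverse order raises an ImportError.
import isaacgym  # noqa: F401
import torch

import legged_gym  # noqa: F401
import rsl_rl  # noqa: F401

print("Imports OK. CUDA available:", torch.cuda.is_available())
```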
- First, set your OpenAI API key via:
  ```bash
  export OPENAI_API_KEY=<YOUR_KEY>
  ```
- To ensure that necessary libraries are being detected properly, update `LD_LIBRARY_PATH` with:
  ```bash
  export LD_LIBRARY_PATH=~/anaconda3/envs/eurekaverse/lib:$LD_LIBRARY_PATH
  ```
- Now, we are ready to begin environment curriculum generation. Review the configuration in `eurekaverse/eurekaverse/config/config.yaml`; the current parameters are the ones used for our experiments. To run generation (an optional pre-flight check is sketched after these steps):
  ```bash
  cd eurekaverse
  python run_eurekaverse.py
  ```
  The outputs will be saved in `eurekaverse/outputs/run_eurekaverse/<RUN_ID>`.
- Afterwards, distill the final policy via:
  ```bash
  python distill_eurekaverse.py <YOUR_RUN_ID>
  ```
  Similarly, the outputs will be saved in `eurekaverse/outputs/distill_eurekaverse/<RUN_ID>`.
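Before launching a long generation run (the `run_eurekaverse.py` step above), an optional pre-flight check can confirm the API key and configuration are in place. This is a minimal sketch and not part of the repository; it assumes the `openai>=1.0` Python client and reads the config path listed above from the repository root.

```python
# preflight_check.py -- optional sanity checks before generation (not part of the repo)
import os
import yaml  # provided by the pyyaml package

# The LLM-based environment generation needs the OpenAI key exported (see the step above).
assert os.environ.get("OPENAI_API_KEY"), "OPENAI_API_KEY is not set"

# Confirm the key is valid with a cheap authenticated call (assumes the openai>=1.0 client).
from openai import OpenAI
OpenAI().models.list()

# Print the generation parameters that will be used.
with open("eurekaverse/eurekaverse/config/config.yaml") as f:
    cfg = yaml.safe_load(f)
print(yaml.dump(cfg, default_flow_style=False))
```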
Our deployment infrastructure on the Unitree Go1 uses LCM for low-level commands and Docker to run the policy. Note that our Docker image has only been tested on the Jetson Xavier NX onboard the Go1. Our setup is loosely based on LucidSim and Walk These Ways.
- Connect a Realsense D435 to the middle USB port on the Go1. 3D print and mount the Realsense using this design from Robot Parkour Learning. (A standalone camera check is sketched after these deployment steps.)
- Start up the Go1 and connect to it from your machine via Ethernet. Make sure you can ssh onto the NX (192.168.123.15).
- Put the robot into damping mode with the controller: L2+A, L2+B, L1+L2+START. The robot should be lying on the ground afterwards.
- Build the Docker image with:
  ```bash
  cd go1_deploy/docker
  docker buildx build --platform linux/arm64 -t go1-deploy:latest . --load
  ```
- Save the image:
  ```bash
  docker save go1-deploy -o go1_deploy.tar
  ```
- Copy the Docker image and other necessary files over to the Go1:
  ```bash
  ./send_to_unitree.sh
  scp go1_deploy.tar go1-nx:/home/unitree/go1_gym/go1_gym_deploy/scripts
  ```
- Connect onto the Go1 NX, then load the Docker image:
  ```bash
  sudo docker load -i go1_deploy.tar
  ```
- Connect onto the Go1 NX. You should see `eurekaverse` in the home directory (from `./send_to_unitree.sh`).
- Start LCM:
  ```bash
  cd eurekaverse/go1_deploy/launch
  ./start_lcm.sh
  ```
- Start and enter the Docker container:
  ```bash
  sudo -E ./start_docker.sh
  ./enter_docker.sh
  ```
- Within the container, run the policy:
  ```bash
  python3 deploy_policy.py
  ```
- Monitor the output, and when it's ready to calibrate, press R2. Pressing R2 again will start the policy, and pressing R2 once more will stop it.
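If the policy runs but the robot does not seem to react to obstacles, it can help to verify the depth camera independently of the deployment pipeline. The sketch below uses the `pyrealsense2` package (an assumption; it is not necessarily installed in the provided Docker image) to confirm that the D435 is detected and streaming depth.

```python
# realsense_check.py -- optional standalone check of the D435, outside the deployment pipeline
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)  # 640x480 depth at 30 FPS
pipeline.start(config)
try:
    frames = pipeline.wait_for_frames()
    depth = frames.get_depth_frame()
    # Distance in meters at the image center, as a quick sanity check
    print("Center depth (m):", depth.get_distance(320, 240))
finally:
    pipeline.stop()
```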
We thank the following open-source projects:
- Our simulation runs in IsaacGym.
- Our parkour simulation builds on Extreme Parkour.
- Our deployment infrastructure builds on LucidSim and Walk These Ways.
- Our Realsense mount was released in Robot Parkour Learning.
- The environment structure and training code build on Legged Gym and RSL_RL.
This codebase is released under the MIT License.
If you find our work useful, please consider citing us!
@inproceedings{liang2024eurekaverse,
title = {Eurekaverse: Environment Curriculum Generation via Large Language Models},
author = {William Liang and Sam Wang and Hungju Wang and Osbert Bastani and Dinesh Jayaraman and Jason Ma},
year = {2024},
booktitle = {Conference on Robot Learning (CoRL)}
}