Model-Based Active Exploration (MAX)

Code for reproducing experiments in Model-Based Active Exploration, ICML 2019

Written in PyTorch v1.0.

Code relies on sacred for managing experiments and hyper-parameters.

Overview:

envs/: contains the environments used.
main.py: contains the main algorithm and the baselines, selected through modes.
models.py: a fast parallel implementation of an ensemble of models, trained with a negative log-likelihood loss.
utilities.py: contains all the utilities (exploration objectives) used in the paper.
imagination.py: contains code that constructs a virtual MDP using the model ensemble.
sac.py: contains a simple Soft Actor-Critic implementation.
sacred_fetcher.py: script to download experiment artifacts stored in MongoDB.
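
As a rough illustration of the negative log-likelihood loss mentioned for models.py, here is a minimal sketch. It assumes each ensemble member predicts a diagonal Gaussian (a mean and a log-variance per output dimension). The function names `gaussian_nll` and `ensemble_nll` are hypothetical, and the code is plain Python for readability. It is not the batched PyTorch implementation in this repository.

```python
import math


def gaussian_nll(mean, log_var, target):
    """Negative log-likelihood of `target` under a diagonal Gaussian.

    All three arguments are equal-length sequences of floats.
    (Assumed parameterization: mean and log-variance per dimension.)
    """
    nll = 0.0
    for mu, lv, y in zip(mean, log_var, target):
        # Per dimension: 0.5 * [log(2*pi) + log(var) + (y - mu)^2 / var]
        nll += 0.5 * (math.log(2.0 * math.pi) + lv + (y - mu) ** 2 / math.exp(lv))
    return nll


def ensemble_nll(predictions, target):
    """Average the NLL over ensemble members.

    `predictions` is a list of (mean, log_var) pairs, one per member.
    """
    return sum(gaussian_nll(m, lv, target) for m, lv in predictions) / len(predictions)


# Two hypothetical members predicting a 1-D next state; the observed value is 0.0.
members = [([0.0], [0.0]), ([1.0], [0.0])]
print(ensemble_nll(members, [0.0]))
```

The member whose mean is further from the target contributes a larger loss, so minimizing this quantity pulls each member's predictive distribution toward the observed transitions.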

Installation

Install required dependencies:

sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf

Create conda environment with required dependencies:
```
conda env create -f conda_env.yml
```
Download and set up the MuJoCo binaries. The project uses mujoco and mujoco_py version 1.50.
```
mkdir ~/.mujoco/
cd ~/.mujoco/
wget -c https://www.roboti.us/download/mjpro150_linux.zip
unzip mjpro150_linux.zip
rm mjpro150_linux.zip
```
Obtain a MuJoCo license key and place it in the ~/.mujoco/ directory created above, with the filename mjkey.txt.

Append the following to ~/.bashrc:

# MuJoCo
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/<USER>/.mujoco/mjpro150/bin

if [ -f /usr/lib/x86_64-linux-gnu/libGLEW.so ]; then
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/<USER>/.mujoco/mjpro150/bin:/usr/lib/nvidia-390
    export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libGLEW.so
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia-375
fi

Quick test of MuJoCo installation

>>> import gym
>>> gym.make('HalfCheetah-v2')
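
If the quick test fails, it can help to first confirm that the files from the installation steps are where they are expected. The following is a small sketch of such a check. The helper name `missing_mujoco_files` is hypothetical, and it only looks for the `mjpro150/bin` directory and `mjkey.txt` described above. It does not validate the license or the library paths.

```python
from pathlib import Path


def missing_mujoco_files(mujoco_dir):
    """Return the expected MuJoCo paths that do not exist under `mujoco_dir`."""
    root = Path(mujoco_dir)
    expected = [
        root / "mjpro150" / "bin",  # binaries unpacked from mjpro150_linux.zip
        root / "mjkey.txt",         # license key file
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    missing = missing_mujoco_files(Path.home() / ".mujoco")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("MuJoCo files found.")
```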

Commands

Execute the commands listed below from the code directory to reproduce the results.

Half Cheetah

MAX:

python main.py with max_explore env_noise_stdev=0.02

Trajectory Variance Active Exploration:

python main.py with max_explore utility_measure=traj_stdev policy_explore_alpha=0.2 env_noise_stdev=0.02

Renyi Divergence Reactive Exploration:

python main.py with max_explore exploration_mode=reactive env_noise_stdev=0.02

Prediction Error Reactive Exploration:

python main.py with max_explore exploration_mode=reactive utility_measure=pred_err policy_explore_alpha=0.2 env_noise_stdev=0.02

Random Exploration:

python main.py with random_explore env_noise_stdev=0.02

Ant

MAX:

python main.py with max_explore env_name=MagellanAnt-v2 env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True

Trajectory Variance Active Exploration:

python main.py with max_explore env_name=MagellanAnt-v2 utility_measure=traj_stdev policy_explore_alpha=0.2 env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True

Renyi Divergence Reactive Exploration:

python main.py with max_explore env_name=MagellanAnt-v2 exploration_mode=reactive env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True

Prediction Error Reactive Exploration:

python main.py with max_explore env_name=MagellanAnt-v2 exploration_mode=reactive utility_measure=pred_err policy_explore_alpha=0.2 env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True

Random Exploration:

python main.py with random_explore env_name=MagellanAnt-v2 env_noise_stdev=0.02 eval_freq=1500 checkpoint_frequency=1500 ant_coverage=True
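
The commands above all follow the same pattern: a configuration name after `with`, followed by `key=value` overrides. When launching many variants, the command lines can be generated rather than typed by hand. The sketch below is one way to do that. The helper `sacred_command` and the `ANT` dictionary are hypothetical conveniences, not part of this repository, and the sketch assumes `max_explore` and `random_explore` are sacred named configs.

```python
import shlex


def sacred_command(named_config, **overrides):
    """Build a `python main.py with ...` command line.

    `named_config` is placed right after `with`; each override becomes key=value.
    """
    parts = ["python", "main.py", "with", named_config]
    parts += [f"{key}={value}" for key, value in overrides.items()]
    # Quote each part so the result is safe to paste into a shell.
    return " ".join(shlex.quote(part) for part in parts)


# Overrides shared by all Ant commands listed above.
ANT = dict(
    env_name="MagellanAnt-v2",
    env_noise_stdev=0.02,
    eval_freq=1500,
    checkpoint_frequency=1500,
    ant_coverage=True,
)

print(sacred_command("max_explore", env_noise_stdev=0.02))
print(sacred_command("random_explore", **ANT))
```

The two printed lines reproduce the Half Cheetah MAX command and the Ant random exploration command, respectively.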

Magellan

Magellan is the internal code name of the project, inspired by the life of Ferdinand Magellan.
