Add CleanRL multi-agent Atari example (#1033)
elliottower authored Jul 20, 2023
1 parent 6a04989 commit 6a20a32
Showing 7 changed files with 411 additions and 18 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/linux-tutorials-test.yml
@@ -33,5 +33,6 @@ jobs:
cd tutorials/${{ matrix.tutorial }}
pip install -r requirements.txt
pip uninstall -y pettingzoo
-pip install -e $root_dir
+pip install -e $root_dir[testing]
+AutoROM -v
for f in *.py; do xvfb-run -a -s "-screen 0 1024x768x24" python "$f"; done
26 changes: 26 additions & 0 deletions docs/tutorials/cleanrl/advanced_PPO.md
@@ -0,0 +1,26 @@
---
title: "CleanRL: Advanced PPO"
---

# CleanRL: Advanced PPO

This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on [Atari](https://pettingzoo.farama.org/environments/atari/) environments ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
This is a full training script including CLI, logging and integration with [TensorBoard](https://www.tensorflow.org/tensorboard) and [WandB](https://wandb.ai/) for experiment tracking.

This tutorial is mirrored from [CleanRL](https://github.com/vwxyzjn/cleanrl)'s examples. Full documentation and experiment results can be found at [https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy](https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy).

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
```{eval-rst}
.. literalinclude:: ../../../tutorials/CleanRL/requirements.txt
:language: text
```

Then, install ROMs using [AutoROM](https://github.com/Farama-Foundation/AutoROM), or specify the path to your Atari rom using the `rom_path` argument (see [Common Parameters](/environments/atari/#common-parameters)).
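As a quick sanity check that the ROMs are visible to PettingZoo, here is a minimal sketch (assuming ROMs were installed via AutoROM; `pong_v3` is just an example — any PettingZoo Atari environment is constructed the same way):

```python
# A minimal sketch, assuming ROMs were installed with AutoROM.
# pong_v3 is an illustrative choice of environment.
from pettingzoo.atari import pong_v3

env = pong_v3.parallel_env()
observations, infos = env.reset()
print(env.agents)  # e.g. ['first_0', 'second_0']
env.close()
```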

## Code
The following code should run without any issues. The comments are designed to help you understand how to use PettingZoo with CleanRL. If you have any questions, please feel free to ask in the [Discord server](https://discord.gg/nhvKkYa6qX), or create an issue on [CleanRL's GitHub](https://github.com/vwxyzjn/cleanrl/issues).
```{eval-rst}
.. literalinclude:: ../../../tutorials/CleanRL/cleanrl_advanced.py
:language: python
```
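For context, the core of a multi-agent Atari setup like this one is converting a parallel PettingZoo environment into a vectorized, Gym-style environment with [SuperSuit](https://github.com/Farama-Foundation/SuperSuit), so a single-agent PPO implementation can train all agents at once. A hedged sketch of that preprocessing (wrapper names follow SuperSuit's API; the exact wrappers and arguments in the tutorial script may differ):

```python
import supersuit as ss
from pettingzoo.atari import pong_v3

env = pong_v3.parallel_env()
env = ss.max_observation_v0(env, 2)                # handle Atari frame flickering
env = ss.frame_skip_v0(env, 4)                     # standard Atari frame skipping
env = ss.resize_v1(env, 84, 84)                    # downscale observations
env = ss.frame_stack_v1(env, 4)                    # stack frames so the policy sees motion
env = ss.agent_indicator_v0(env, type_only=False)  # let one shared policy tell agents apart
env = ss.pettingzoo_env_to_vec_env_v1(env)         # expose agents as vector-env slots
envs = ss.concat_vec_envs_v1(env, 8, num_cpus=0, base_class="gymnasium")
```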
2 changes: 1 addition & 1 deletion docs/tutorials/cleanrl/implementing_PPO.md
@@ -4,7 +4,7 @@ title: "CleanRL: Implementing PPO"

# CleanRL: Implementing PPO

-This tutorial shows how to train a [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agennt on the [Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
+This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on the [Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
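For reference, the Pistonball tutorial's environment preprocessing looks roughly like the sketch below; wrapper choices and arguments are illustrative and may differ from the actual script:

```python
import supersuit as ss
from pettingzoo.butterfly import pistonball_v6

env = pistonball_v6.parallel_env(continuous=False)
env = ss.color_reduction_v0(env, mode="B")     # grayscale via the blue channel
env = ss.resize_v1(env, x_size=64, y_size=64)  # downscale observations
env = ss.frame_stack_v1(env, stack_size=3)     # stack frames so the policy sees motion
```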
9 changes: 6 additions & 3 deletions docs/tutorials/cleanrl/index.md
@@ -6,7 +6,9 @@ title: "CleanRL"

This tutorial shows how to use [CleanRL](https://github.com/vwxyzjn/cleanrl) to implement a training algorithm from scratch and train it on the Pistonball environment.

-* [Implementing PPO](/tutorials/cleanrl/implementing_PPO.md): _Implement and train an agent using PPO_
+* [Implementing PPO](/tutorials/cleanrl/implementing_PPO.md): _Train an agent using a simple PPO implementation_
+
+* [Advanced PPO](/tutorials/cleanrl/advanced_PPO.md): _CleanRL's official PPO example, with CLI, TensorBoard and WandB integration_


## CleanRL Overview
@@ -16,14 +18,14 @@ This tutorial shows how to use [CleanRL](https://github.com/vwxyzjn/cleanrl) to

See the [documentation](https://docs.cleanrl.dev/) for more information.

-## Official examples using PettingZoo:
+## Examples using PettingZoo:

* [PPO PettingZoo Atari example](https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy)


## WandB Integration

-A key feature is its tight integration with [Weights & Biases](https://wandb.ai/) (WandB): for experiment tracking, hyperparameter tuning, and benchmarking.
+A key feature is CleanRL's tight integration with [Weights & Biases](https://wandb.ai/) (WandB): for experiment tracking, hyperparameter tuning, and benchmarking.
The [Open RL Benchmark](https://github.com/openrlbenchmark/openrlbenchmark) allows users to view public leaderboards for many tasks, including videos of agents' performance across training timesteps.
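As an illustration of what that integration looks like inside a CleanRL-style script (the variable names below are stand-ins, not CleanRL's exact CLI arguments):

```python
import time
import wandb

# Hypothetical stand-ins for values a CleanRL-style script parses from its CLI.
project_name = "pettingzoo-benchmarks"
run_name = f"pong_v3__ppo__{int(time.time())}"

wandb.init(
    project=project_name,
    sync_tensorboard=True,  # mirror TensorBoard scalars into the WandB run
    name=run_name,
    save_code=True,         # snapshot the training script alongside the run
)
```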


@@ -38,4 +40,5 @@ The [Open RL Benchmark](https://github.com/openrlbenchmark/openrlbenchmark) allo
:caption: CleanRL
implementing_PPO
+advanced_PPO
```
12 changes: 0 additions & 12 deletions docs/tutorials/sb3/index.md
@@ -34,25 +34,13 @@ For non-visual environments, we use [MLP](https://stable-baselines3.readthedocs
```




## Stable-Baselines Overview

[Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/) (SB3) is a library providing reliable implementations of reinforcement learning algorithms in [PyTorch](https://pytorch.org/). It provides a clean and simple interface, giving you access to off-the-shelf state-of-the-art model-free RL algorithms. It allows training of RL agents with only a few lines of code.

For more information, see the [Stable-Baselines3 v1.0 Blog Post](https://araffin.github.io/post/sb3/).
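For example, a minimal single-agent training run in SB3 is only a few lines; this sketch uses Gymnasium's `CartPole-v1` as a stand-in task:

```python
from stable_baselines3 import PPO

# SB3 accepts a Gymnasium environment id directly; CartPole-v1 is just an
# illustrative single-agent task.
model = PPO("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10_000)
```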


-[//]: # (```{eval-rst})
-
-[//]: # (.. warning::)
-
-[//]: # ()
-[//]: # ( Note: SB3 is designed for single-agent RL and does not plan on natively supporting multi-agent PettingZoo environments. These tutorials are only intended for demonstration purposes, to show how SB3 can be adapted to work in multi-agent settings.)
-
-[//]: # (```)
```{figure} https://raw.githubusercontent.com/DLR-RM/stable-baselines3/master/docs/_static/img/logo.png
:alt: SB3 Logo
:width: 80%