Overview | Installation | Agents | Examples
This repository is part of my master's thesis project at UCL. It builds upon the acme framework and implements two new offline RL algorithms.
The experiments here are run on the MiniGrid environment, but the code is modular: a new environment can be tested simply by implementing a new `_build_environment()` function that returns an environment wrapped in the appropriate wrappers. An example of a working environment is set up in each of the example Colaboratory notebooks provided.
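For reference, here is a minimal sketch of what such a builder might look like, assuming the standard gym-minigrid and acme wrapper stacks (the exact wrappers used in the notebooks may differ):

```python
import gym
from acme import wrappers
from gym_minigrid.wrappers import FullyObsWrapper, ImgObsWrapper

def _build_environment(env_name: str = "MiniGrid-Empty-6x6-v0"):
    """Returns a MiniGrid environment wrapped for use with acme agents."""
    env = gym.make(env_name)
    env = FullyObsWrapper(env)   # expose the full grid instead of the agent's partial view
    env = ImgObsWrapper(env)     # keep only the image observation, drop the mission string
    env = wrappers.GymWrapper(env)              # adapt the gym API to dm_env
    env = wrappers.SinglePrecisionWrapper(env)  # cast observations/rewards to float32
    return env
```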
This repo implements three algorithms:
- Conservative Q-learning (CQL); a sketch of its loss is shown after this list
- Critic Regularized Regression (CRR)
- Behavioural Cloning (BC), adapted from acme
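To give a flavour of the first algorithm, below is a hedged sketch of the conservative penalty that CQL adds on top of a standard TD loss for discrete actions; the actual implementation in this repo may differ in details such as batching and the penalty weight:

```python
import tensorflow as tf

def cql_penalty(q_values: tf.Tensor, behaviour_actions: tf.Tensor) -> tf.Tensor:
    """CQL regulariser for discrete actions.

    q_values: [batch, num_actions] critic outputs Q(s, .).
    behaviour_actions: [batch] integer actions taken in the offline dataset.
    """
    # Soft maximum over all actions: pushes Q-values down everywhere.
    push_down = tf.reduce_logsumexp(q_values, axis=-1)
    # Q-values of the dataset actions: pushed back up.
    push_up = tf.gather(q_values, behaviour_actions, batch_dims=1)
    return tf.reduce_mean(push_down - push_up)
```

In the full objective this penalty is scaled by a coefficient and added to the usual TD error.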
After setting up a wandb account, all the results of our experiments, along with the versioned datasets, can be accessed here.
New datasets can be easily collected using the `dataset_collection_pipeline` Colab notebook, and experiments can be run from the `run_experiment_pipeline` notebook. Both notebooks are well documented. Each new experiment is tracked and checkpointed to WandB.
If you'd like to resume an existing run, it is sufficient to pass the specific run ID via the `--wandb_id` flag to any of the algorithm run scripts.
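For example (the script name below is hypothetical; substitute whichever algorithm run script you are using):

```
python run_cql.py --wandb_id=<your_run_id>
```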