update README
HokageM committed Dec 9, 2023
1 parent c8ec0c4 commit b60980b
Showing 1 changed file with 26 additions and 9 deletions.
35 changes: 26 additions & 9 deletions README.md
@@ -4,25 +4,42 @@

Inverse Reinforcement Learning algorithm implementations in Python.

# Exploring Maximum Entropy Inverse Reinforcement Learning

My seminar paper, which is based on IRLwPython version 0.0.1, can be found
in [paper](https://github.com/HokageM/IRLwPython/tree/main/paper).

# Implemented Algorithms

## Maximum Entropy IRL:

An implementation of the Maximum Entropy inverse reinforcement learning algorithm from [1], based on the
implementation
of [lets-do-irl](https://github.com/reinforcement-learning-kr/lets-do-irl/tree/master/mountaincar/maxent).
It is an IRL algorithm using Q-learning with a Maximum Entropy update function.

## Maximum Entropy IRL (MEIRL):

An implementation of the maximum entropy inverse reinforcement learning algorithm from [1], based on the
implementation
of [lets-do-irl](https://github.com/reinforcement-learning-kr/lets-do-irl/tree/master/mountaincar/maxent).
It is an IRL algorithm using Q-learning with a maximum entropy update function for the IRL reward estimation.
The next action is the one with the maximum Q-value.
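
As a rough illustration of the two steps described for MEIRL, the sketch below shows a maximum entropy reward update followed by greedy action selection. It assumes a linear reward with one weight per discrete state, in which case the maximum entropy gradient is the expert's state visitation frequencies minus the learner's. The function names, the learning rate, and the toy numbers are hypothetical and are not taken from IRLwPython.

```python
import numpy as np


def maxent_reward_update(theta, expert_svf, learner_svf, learning_rate=0.05):
    """One gradient step on the reward weights (assumed: one weight per state).

    expert_svf / learner_svf are state visitation frequencies, i.e. how often
    the expert and the current learner policy visit each state.
    """
    gradient = expert_svf - learner_svf
    return theta + learning_rate * gradient


def greedy_action(q_table, state):
    """Pick the action with the maximum Q-value in the given state."""
    return int(np.argmax(q_table[state]))


# Toy data (hypothetical): 3 states, 2 actions.
theta = np.zeros(3)
expert_svf = np.array([0.1, 0.2, 0.7])
learner_svf = np.array([0.5, 0.3, 0.2])
theta = maxent_reward_update(theta, expert_svf, learner_svf)

q_table = np.array([[0.0, 1.0], [2.0, -1.0], [0.5, 0.5]])
action = greedy_action(q_table, state=0)
```

States the expert visits more often than the learner get a higher estimated reward, and the Q-learning agent is then trained on that estimated reward instead of the environment reward.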

## Maximum Entropy Deep IRL (MEDIRL):

An implementation of the maximum entropy inverse reinforcement learning algorithm that uses a neural network for the
actor.
The estimated IRL reward is learned in a similar way to MEIRL.
It is an IRL algorithm using deep Q-learning with a maximum entropy update function.
The next action is selected with an epsilon-greedy strategy, which usually picks the action with the maximum Q-value.
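
The epsilon-greedy selection mentioned for MEDIRL can be sketched as follows. This is a minimal example under assumptions: `q_values` stands in for the network's output for one state, and the function name and the `epsilon` value are hypothetical rather than taken from IRLwPython.

```python
import numpy as np


def epsilon_greedy_action(q_values, epsilon, rng):
    """With probability epsilon explore (random action), otherwise exploit (argmax)."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))


rng = np.random.default_rng(0)
q_values = np.array([0.1, 0.9, 0.3])  # hypothetical Q-values for 3 actions
action = epsilon_greedy_action(q_values, epsilon=0.1, rng=rng)
```

With `epsilon=0.0` the function reduces to the purely greedy selection used in MEIRL.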

## Maximum Entropy Deep RL (MEDRL):

An implementation of the maximum entropy reinforcement learning algorithm.
MEDRL is an RL counterpart of the MEDIRL algorithm and is used to compare the IRL algorithms with an RL algorithm.
It gets the real rewards directly from the environment
instead of estimating IRL rewards.
The NN architecture and action selection are the same as in MEDIRL.
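
The difference between the RL and IRL variants comes down to which reward enters the Q-update. The sketch below uses a tabular Q-learning step as a stand-in for the neural network, purely for illustration; the function name, the reward weights `theta`, indexing the estimated reward by the next state, and all numbers are assumptions, not IRLwPython code.

```python
import numpy as np


def q_learning_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Standard Q-learning step; the caller decides where `reward` comes from."""
    td_target = reward + gamma * np.max(q_table[next_state])
    q_table[state, action] += alpha * (td_target - q_table[state, action])
    return q_table


theta = np.array([-0.2, 0.3])  # hypothetical estimated IRL reward, one value per state
env_reward = -1.0              # hypothetical reward returned by the environment

# RL variant: use the real environment reward.
q_rl = q_learning_update(np.zeros((2, 2)), 0, 1, env_reward, 1)

# IRL variant: use the estimated reward of the reached state instead.
q_irl = q_learning_update(np.zeros((2, 2)), 0, 1, theta[1], 1)
```

Everything except the reward source is identical in the two calls, which is what makes the RL run usable as a baseline for the IRL runs.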

# Experiment
