This work implements the Deep Q-Network (DQN) algorithm, in which a function approximator represents the action-value function because discretizing the state space would be too costly. A deep neural network represents the mapping Q(s, a); using the Bellman equation as the learning target, we train the network parameters to produce the Q-values.
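As a concrete illustration, the sketch below shows the kind of TD-learning step this implies: Q(s, a) is regressed toward the Bellman target r + γ · max_a' Q(s', a') computed with a separate target network. This is a minimal sketch; function and tensor names are illustrative, not necessarily those used in this repository.

```python
import torch
import torch.nn.functional as F

def compute_dqn_loss(q_network, target_network, states, actions,
                     rewards, next_states, dones, gamma=0.99):
    """One DQN learning step on a batch of transitions.

    Shapes assumed: states/next_states (batch, state_size),
    actions (batch, 1) int64, rewards/dones (batch, 1) float.
    """
    # Q-values of the actions that were actually taken
    q_expected = q_network(states).gather(1, actions)

    # Bellman target uses a separate, slowly updated target network
    with torch.no_grad():
        q_next = target_network(next_states).max(dim=1, keepdim=True)[0]
        q_target = rewards + gamma * q_next * (1 - dones)

    return F.mse_loss(q_expected, q_target)
```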
We test two network architectures (sketched in code after the lists):
DQN_32
- Fully connected layer - input: 37 (state size), output: 32
- Fully connected layer - input: 32, output: 32
- Fully connected layer - input: 32, output: (action size)
DQN_64
- Fully connected layer - input: 37 (state size), output: 64
- Fully connected layer - input: 64, output: 64
- Fully connected layer - input: 64, output: (action size)
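Since the two models differ only in their hidden width, both can be expressed as one module with a `hidden` parameter. This is a minimal PyTorch sketch assuming an action size of 4 and ReLU activations between layers (the activations are not stated above):

```python
import torch.nn as nn

class DQN(nn.Module):
    """Three fully connected layers mapping the 37-dim state to action values.

    hidden=32 gives DQN_32, hidden=64 gives DQN_64.
    action_size=4 is an assumption for illustration.
    """
    def __init__(self, state_size=37, action_size=4, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_size, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, action_size),  # one Q-value per action
        )

    def forward(self, state):
        return self.net(state)
```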
Parameters used in the DQN algorithm (the epsilon-greedy schedule is sketched below the list):
- Maximum steps per episode: 2000
- Starting epsilon: 1.0
- Ending epsilon: 0.01
- Epsilon decay rate: 0.999
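A minimal sketch of how these epsilon values are typically used, assuming an epsilon-greedy policy with multiplicative decay applied once per episode (helper names are illustrative):

```python
import random

import torch

eps_start, eps_end, eps_decay = 1.0, 0.01, 0.999  # values from the list above

def select_action(q_network, state, action_size, eps):
    """Epsilon-greedy policy: explore with probability eps, otherwise act greedily."""
    if random.random() < eps:
        return random.randrange(action_size)
    with torch.no_grad():
        return int(q_network(state).argmax(dim=-1).item())

def decay_epsilon(eps):
    """Multiplicative decay after each episode, floored at eps_end."""
    return max(eps_end, eps * eps_decay)
```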
The agents were able to solve the task in fewer than 600 episodes, with very similar learning behaviour for both models:
[Score plots per model: DQN_32 and DQN_64]
Future work could include an investigation of hyperparameters such as learning rates and batch sizes. The task could likely be completed faster using DQN enhancements such as dueling networks and prioritized experience replay (a dueling head is sketched below).
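For reference, a dueling network splits the final layer into separate value and advantage streams and recombines them as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a). This is only an illustrative sketch of the enhancement mentioned above, not part of this work; names and sizes are assumptions:

```python
import torch.nn as nn

class DuelingDQN(nn.Module):
    """Dueling architecture: shared trunk, then value and advantage streams."""
    def __init__(self, state_size=37, action_size=4, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_size, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                 # state value V(s)
        self.advantage = nn.Linear(hidden, action_size)   # advantages A(s, a)

    def forward(self, state):
        x = self.trunk(state)
        v = self.value(x)
        a = self.advantage(x)
        # Subtract the mean advantage so V and A are identifiable
        return v + a - a.mean(dim=-1, keepdim=True)
```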