initial bias towards action=1? #7

Why does the network start with such a strong bias towards trying action=1 at every timestep? I only occasionally see action=0. It looks like it would be difficult to break out of this pattern, since the agent receives reward = 0.1 for it before encountering the first pipe gap.
Comments
During the initial stage of training, the agent simply performs random exploration... The network should be able to learn "don't flap too much" after training for a while.
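For reference, here is a minimal sketch of the ε-greedy action selection that DQN agents like this one typically use; the names (`q_values_fn`, `select_action`) and the ε handling are illustrative, not the repo's exact identifiers:

```python
import random
import numpy as np

ACTIONS = 2  # 0 = do nothing, 1 = flap

def select_action(q_values_fn, state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if random.random() <= epsilon:
        # pure exploration: both actions are equally likely on these steps
        return random.randrange(ACTIONS)
    # exploitation: follow the current Q-estimates for this state
    return int(np.argmax(q_values_fn(state)))
```

On the exploration branch both actions are drawn uniformly, so random-exploration steps alone should produce roughly a 50/50 action split.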
@yanpanlau If the initial exploration is random, I would expect both actions to be equally likely initially? That isn't the case.
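One plausible mechanism for the skew, separate from the bug found later in this thread: only the ε fraction of steps is uniformly random, while the rest follow argmax over an untrained network. Because consecutive game frames are nearly identical, an untrained network tends to rank the same action highest across all of them, so the greedy steps can pick one action almost exclusively. A small self-contained illustration, using a random linear map as a stand-in for the untrained Q-network:

```python
import numpy as np

rng = np.random.default_rng(0)
N_PIXELS, N_ACTIONS = 80 * 80, 2

# stand-in for an untrained Q-network: fixed random weights
W = rng.normal(size=(N_PIXELS, N_ACTIONS))

# consecutive frames look alike, so model them as small
# perturbations of a single base state
base = rng.normal(size=N_PIXELS)
states = base + 0.01 * rng.normal(size=(1000, N_PIXELS))

greedy = np.argmax(states @ W, axis=1)
print(np.bincount(greedy, minlength=N_ACTIONS) / len(greedy))
# typically prints ~1.0 for one action and ~0.0 for the other
```

Which action dominates is arbitrary per initialization, but within a single run the greedy choice is nearly constant.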
How long is "a while"?
I just re-tested it and it should converge after around 100,000 steps. Can you try with the latest code?
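A quick way to check whether training is converging toward "don't flap too much" is to log the flap fraction over a sliding window; `recent_actions` here is a hypothetical addition to the training loop, not an existing variable in the repo:

```python
from collections import deque

WINDOW = 10_000
recent_actions = deque(maxlen=WINDOW)

def flap_fraction():
    """Fraction of recent steps on which the agent chose action 1 (flap)."""
    if not recent_actions:
        return 0.0
    return sum(a == 1 for a in recent_actions) / len(recent_actions)

# in the training loop:
#   recent_actions.append(action)
#   if step % WINDOW == 0:
#       print(f"step {step}: flap fraction = {flap_fraction():.3f}")
```

If learning is working, this fraction should drift well below its early value by the time the run approaches the 100,000-step mark mentioned above.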
Same issue here, as @wobeert described. It didn't change even after 622,000 steps. See below.
I fixed it! I introduced a bug by mistake into the original code when I was creating the multi-GPU version for GPU Keras. Thanks!