BreakoutNoFrameskip-v4 does not advance to 3rd level, capping score at 864 #1618
Comments
@ludwigschubert could you take a look maybe?
For the record, here is the list of actions taken by the model (one action per line). Each action should be passed to the envs in a single-element list because envs are wrapped in
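A minimal sketch of how such an action list might be replayed, assuming the actions are stored one integer per line. `StubVecEnv`, `read_actions`, and `replay` are hypothetical names made up for this illustration and are not taken from the original script; the stub only mimics an environment that expects one action per sub-environment.

```python
import io


def read_actions(stream):
    """Parse one integer action per line, skipping blank lines."""
    return [int(line) for line in stream if line.strip()]


class StubVecEnv:
    """Hypothetical stand-in for a vectorized env with a single sub-env."""

    def __init__(self):
        self.received = []

    def step(self, actions):
        # A vectorized env takes one action per sub-env, so a single
        # sub-env still needs a list of length one.
        if len(actions) != 1:
            raise ValueError("expected exactly one action")
        self.received.append(actions[0])
        # Observation, rewards, dones, infos (placeholder values only).
        return None, [0.0], [False], [{}]


def replay(env, actions):
    """Feed each recorded action to the env as a single-element list."""
    for action in actions:
        env.step([action])


# Inline sample data; a real run would read the posted action list instead.
actions = read_actions(io.StringIO("1\n3\n3\n2\n"))
env = StubVecEnv()
replay(env, actions)
```

With a real environment, the stub would be replaced by the wrapped env and the sample string by the posted action file.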
Thanks for reopening this issue and providing more details! From looking at this video, I believe this is a bug in the game itself, and it looks like these sorts of bugs are being tracked in this issue: Farama-Foundation/Arcade-Learning-Environment#262. Could you post a comment there linking to this? I will likely close this issue later because this is a bug in ALE, not gym itself.
Closing this issue as mentioned earlier. Thanks for all the information, but it looks like it is up to ALE to decide what to do here.
According to Wikipedia, this is by design:
The score of 864 can be seen achieved here on a hardware Atari 2600. This post also provides disassembly of the original game, showing the code which switches to the next level. It is basically
Here is a super dirty proof of concept of how to make Breakout infinite: dniku/atari-py@fc1dc14
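For reference, the 864 figure is what two fully cleared walls would add up to. The sketch below assumes the commonly described Atari 2600 layout of six rows of 18 bricks, worth 1, 1, 4, 4, 7, and 7 points from bottom to top; these values are not stated in this thread and are an assumption of the example.

```python
# Assumed Atari 2600 Breakout layout (not stated in this thread):
# six rows of 18 bricks, with per-brick values from bottom row to top row.
BRICKS_PER_ROW = 18
ROW_VALUES = [1, 1, 4, 4, 7, 7]

# Points for clearing one full wall of bricks.
points_per_wall = BRICKS_PER_ROW * sum(ROW_VALUES)

# The game reportedly stops producing new walls after the second one.
WALLS_AVAILABLE = 2
max_score = WALLS_AVAILABLE * points_per_wall

print(points_per_wall, max_score)
```

Under these assumptions the cap equals the score reported in the issue title, which fits the observation that no run exceeds it.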
This is a reopening of #309, as requested in that issue.
BreakoutNoFrameskip-v4 does not start a new level after all bricks are cleared twice. I was able to reproduce this with a well-trained cnn ppo2 Baselines model, although it seems that any model that can achieve a score of 864 will do (I have never seen a score of 864 exceeded).
Links:
reproduce_gym_309.pkl and place next to the script)
I ran all experiments in a virtualenv. Here are the commands that I executed to reproduce the issue:
The script that I am providing simply loads the model and runs it, collecting gameplay frames, until the episode ends with a score of 864. Then it dumps the frames to a video file.
The output for me is (omitting log messages from TensorFlow and the tqdm progress bar):
pip list from virtualenv: