
BreakoutNoFrameskip-v4 does not advance to 3rd level, capping score at 864 #1618

Closed

dniku opened this issue Jul 25, 2019 · 6 comments

dniku (Contributor) commented Jul 25, 2019

This is a reopening of #309, as requested in that issue.

BreakoutNoFrameskip-v4 does not start a new level after all bricks have been cleared twice. I was able to reproduce this with a well-trained CNN PPO2 Baselines model, although it seems that any model that can reach a score of 864 will do (I have never seen that score exceeded).

Links:

  • reproducing script
  • model (download it as reproduce_gym_309.pkl and place next to the script)
  • gameplay video (the score of 864 is achieved at 03:49, and then it's just the paddle moving around trying to prevent the ball from falling down on an empty screen)
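For reference, 864 is exactly two screens' worth of bricks under the commonly cited Atari 2600 Breakout scoring (18 bricks per row; the two bottom rows are worth 1 point per brick, the middle two rows 4, the top two rows 7, and the game provides exactly two screens). A quick sanity check:

```python
# Commonly cited Atari 2600 Breakout scoring: 6 rows of 18 bricks,
# with per-brick values 1, 1, 4, 4, 7, 7 from bottom to top.
BRICKS_PER_ROW = 18
ROW_VALUES = [1, 1, 4, 4, 7, 7]

points_per_screen = BRICKS_PER_ROW * sum(ROW_VALUES)
max_score = 2 * points_per_screen  # two screens, no more are provided

print(points_per_screen, max_score)  # 432 864
```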
$ uname -srv
Linux 4.19.59-1-MANJARO #1 SMP PREEMPT Mon Jul 15 18:23:58 UTC 2019
$ python --version
Python 3.7.3

I ran all experiments in a virtualenv. Here are the commands that I executed to reproduce the issue:

virtualenv .env
source .env/bin/activate
pip install tensorflow-gpu gym[atari]
pip install git+https://github.com/openai/baselines.git
python reproduce_gym_309.py reproduce_gym_309.pkl

The script that I am providing simply loads the model and runs it, collecting gameplay frames, until the episode ends with a score of 864. Then it dumps the frames to a video file.

The output for me is (omitting log messages from Tensorflow and tqdm progress bar):

finished episode with reward=436.0, length=5799, elapsed_time=17.346772
finished episode with reward=735.0, length=4392, elapsed_time=30.163985
finished episode with reward=864.0, length=9447, elapsed_time=57.152439

pip list from virtualenv:

$ pip list 
Package              Version 
-------------------- --------
absl-py              0.7.1   
astor                0.8.0   
atari-py             0.2.6   
baselines            0.1.6   
Click                7.0     
cloudpickle          1.2.1   
future               0.17.1  
gast                 0.2.2   
google-pasta         0.1.7   
grpcio               1.22.0  
gym                  0.13.1  
h5py                 2.9.0   
joblib               0.13.2  
Keras-Applications   1.0.8   
Keras-Preprocessing  1.1.0   
Markdown             3.1.1   
numpy                1.16.4  
opencv-python        4.1.0.25
Pillow               6.1.0   
pip                  19.2.1  
protobuf             3.9.0   
pyglet               1.3.2   
scipy                1.3.0   
setuptools           41.0.1  
six                  1.12.0  
tensorboard          1.14.0  
tensorflow-estimator 1.14.0  
tensorflow-gpu       1.14.0  
termcolor            1.1.0   
tqdm                 4.32.2  
Werkzeug             0.15.5  
wheel                0.33.4  
wrapt                1.11.2
dniku (Contributor, Author) commented Jul 26, 2019

@ludwigschubert could you take a look maybe?

dniku (Contributor, Author) commented Jul 26, 2019

For the record, here is the list of actions taken by the model (one action per line). Each action must be passed to the envs in a single-element list, because the envs are wrapped in DummyVecEnv:

with args.load_path.open('r') as fp:
    for action in tqdm(fp, postfix='playing'):
        # each line holds one integer action; convert it before stepping
        obs, reward, done, infos = eval_envs.step([int(action)])
        # ...

christopherhesse (Contributor) commented
Thanks for reopening this issue and providing more details! From looking at the video, I believe this is a bug in the game itself. These sorts of bugs are being tracked in Farama-Foundation/Arcade-Learning-Environment#262. Could you post a comment there linking to this one? I will likely close this issue later, since it is a bug in ALE, not in gym itself.

christopherhesse (Contributor) commented
Closing this issue as mentioned earlier. Thanks for all the information, but it looks like it is up to ALE to decide what to do here.

dniku (Contributor, Author) commented Jul 30, 2019

According to Wikipedia, this is by design:

Once the second screen of bricks is destroyed, the ball in play harmlessly bounces off empty walls until the player restarts the game, as no additional screens are provided.

A score of 864 being reached on real Atari 2600 hardware can be seen here.

This post also provides a disassembly of the original game, showing the code that switches to the next level. It is essentially if score == 432: refill_blocks().
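In Python pseudocode (the function and variable names are my own; the original is 6502 assembly), that check amounts to something like:

```python
SCREEN_SCORE = 432  # points in one full wall of bricks
BRICKS_PER_SCREEN = 6 * 18  # 6 rows of 18 bricks


def refill_blocks():
    """Restore the full wall of bricks for the second screen."""
    return BRICKS_PER_SCREEN


def on_score_change(score, bricks_left):
    """Toy model of the cartridge logic: the wall is refilled exactly
    once, when the score hits 432. At 864 nothing happens, so the ball
    just bounces around an empty screen."""
    if score == SCREEN_SCORE:
        return refill_blocks()
    return bricks_left
```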

dniku (Contributor, Author) commented Aug 2, 2019

Here is a super dirty proof of concept of how to make Breakout infinite: dniku/atari-py@fc1dc14
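An alternative that avoids patching atari-py would be a gym-side wrapper that ends the episode as soon as the 864-point cap is reached, instead of letting the paddle chase the ball on an empty screen. A minimal duck-typed sketch (the EndAtMaxScoreWrapper name and the bare step/reset interface are my own, standing in for a proper gym.Wrapper subclass):

```python
class EndAtMaxScoreWrapper:
    """Terminate the episode once the cumulative reward reaches
    max_score. A workaround sketch, not the atari-py patch above."""

    def __init__(self, env, max_score=864):
        self.env = env
        self.max_score = max_score
        self._total = 0

    def reset(self, **kwargs):
        self._total = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._total += reward
        if self._total >= self.max_score:
            done = True  # force episode end at the score cap
        return obs, reward, done, info
```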
