I'm trying to implement MCTS on OpenAI's Atari Gym environments, which requires the ability to plan: acting in the environment and then restoring it to a previous state. I read that this can be done with the RAM version of the games:
Recording the current state into a snapshot:
snapshot = env.ale.cloneState()
Restoring the environment to a specific state recorded in snapshot:
env.ale.restoreState(snapshot)
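For context, this is the save/act/restore pattern I'm after. The ToyEnv below is just a stand-in I wrote for illustration (its names like clone_state/restore_state are made up, not the ALE API), but it shows what I expect cloneState/restoreState to do for planning:

```python
import copy
import random

class ToyEnv:
    """A stand-in for ALE whose entire state is one counter.
    The class and method names here are hypothetical, for illustration only."""
    def __init__(self):
        self.position = 0

    def act(self, action):
        self.position += action          # advance the state
        return -abs(self.position)       # some reward signal

    def clone_state(self):
        return copy.deepcopy(self.position)  # snapshot of the full state

    def restore_state(self, snapshot):
        self.position = snapshot         # rewind to the snapshot

env = ToyEnv()
snap0 = env.clone_state()                # remember the starting state

for _ in range(10):                      # "plan": act randomly for a while
    env.act(random.choice([-1, 1]))

env.restore_state(snap0)                 # rewind
assert env.position == 0                 # back at the start, as MCTS needs
```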
so I tried it with the RAM version of Breakout:
import gym
import matplotlib.pyplot as plt

env = gym.make("Breakout-ram-v0")
env.reset()
print("initial state:")
plt.imshow(env.render('rgb_array'))
env.close()

# create first snapshot
snap0 = env.ale.cloneState()
Executing the code above shows the image of the start of the game, and we recorded that first state in snap0. Now let's play until the end:
while True:
    r = env.ale.act(env.action_space.sample())
    is_done = env.ale.game_over()
    if is_done:
        print("Whoops! We died!")
        break
print("final state:")
plt.imshow(env.render('rgb_array'))
Executing the code above shows the image of the end of the game. Now let's restore the environment to the first state:
env.ale.restoreState(snap0)
print("\n\nAfter loading snapshot")
plt.imshow(env.render('rgb_array'))
Instead of showing the image of the start of the game, it shows the same image of the end of the game. The environment is not reverting even though I restored the original first state.
If anyone has worked with ALE and saving/restoring states like this, I'd really appreciate help figuring out what I'm doing wrong. Thanks!