How to run multiple OpenAI Gym envs in parallel

Question

I am trying to run 10 OpenAI Gym environments in parallel, each on its own thread. I want to save the transitions for every step in each environment and access them once all the threads have finished. However, I cannot work out how to create a separate instance of each of these lists for every thread and then access them afterwards from the main thread.

Any help would be greatly appreciated.

def run_episode(scale, modification_network, expert_network):
    with lock:
        rollouts_obs = []
        rollouts_action = []
        rollouts_reward = []
        rollouts_done = []
        env = BipedalWalker()
        env.update_scale(scale)
        # reset the environment to collect the first observation
        done = False
        obs = env.reset()
        while not done:
            action = env.action_space.sample()
            obs, reward, done, info = env.step(action)

            rollouts_obs.append(obs)
            rollouts_action.append(action)
            rollouts_reward.append(reward)
            rollouts_done.append(done)

jobs = []
for i in range(10):
    thread = threading.Thread(target=run_episode, args=(scale[i], agent, expert_net))
    jobs.append(thread)
    
for j in jobs:
    j.start()

for j in jobs:
    j.join()

You can return data from your thread as the function's return value. — Anmol Singh Jaggi, Aug 28 '20 at 13:09
@AnmolSinghJaggi, if that is the case, how do you change the variable names so that each thread saves to a different place? — Ljackson, Sep 01 '20 at 10:41
Did you figure out a way of doing it? I am experiencing the same issue. — BAKYAC, Jan 14 '21 at 14:27
No! I never came up with a fix - would love to know if you do!? — Ljackson, Jan 15 '21 at 15:11

score 0 · Answer 1 · answered Oct 17 '21 at 12:22

You could pass the same list (or another thread-safe data structure) as an argument to every run_episode thread and have each thread append its results to that list at the end of the function, instead of returning anything. After all threads have joined, the list will contain all the results in order of completion. Note that lists themselves are thread-safe, but their contents are not (see "Are lists thread-safe?"), so only append to the shared list and never access the appended data from inside run_episode.
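
A minimal sketch of the shared-list approach described above might look like the following. ToyEnv is a hypothetical stand-in for BipedalWalker so that the example runs without gym installed, and worker_id, results and collect_rollouts are names made up for illustration. The sketch also drops the lock from the question, on the assumption that each thread only touches its own local lists until the final append.

```python
import random
import threading


class ToyEnv:
    """Hypothetical stand-in for BipedalWalker; not part of gym."""

    def __init__(self, scale, max_steps=5):
        self.scale = scale
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        obs = self.scale * self.t
        reward = -abs(action)
        done = self.t >= self.max_steps
        return obs, reward, done, {}


def run_episode(worker_id, scale, results):
    # Everything created here is local to this call, so each thread
    # gets its own separate rollout lists.
    rng = random.Random(worker_id)
    env = ToyEnv(scale)
    rollout = {"worker_id": worker_id, "obs": [], "action": [],
               "reward": [], "done": []}
    obs = env.reset()
    done = False
    while not done:
        action = rng.uniform(-1.0, 1.0)
        obs, reward, done, info = env.step(action)
        rollout["obs"].append(obs)
        rollout["action"].append(action)
        rollout["reward"].append(reward)
        rollout["done"].append(done)
    # One append to the shared list at the very end; this thread
    # does not touch the rollout again afterwards.
    results.append(rollout)


def collect_rollouts(scales):
    results = []  # shared between all threads, append-only
    threads = [
        threading.Thread(target=run_episode, args=(i, scale, results))
        for i, scale in enumerate(scales)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Results arrive in completion order, so sort them by worker id.
    return sorted(results, key=lambda r: r["worker_id"])


rollouts = collect_rollouts([0.5 + 0.1 * i for i in range(10)])
```

Tagging each rollout with its worker id means the main thread does not depend on the order in which threads finish. If you would rather have run_episode return its data, concurrent.futures.ThreadPoolExecutor in the standard library collects return values for you, which may be closer to what the first comment had in mind.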
