I am trying to run 10 parallel OpenAI Gym environments, each on its own thread. I want to save the transitions for each step in each env and access them once all the threads have finished. However, I cannot work out how to create a separate instance of each of these lists per thread and access them afterwards from the main thread.

Any help would be greatly appreciated.

def run_episode(scale, modification_network, expert_network):
    with lock:
        rollouts_obs = []
        rollouts_action = []
        rollouts_reward = []
        rollouts_done = []
        env = BipedalWalker()
        env.update_scale(scale)
        # reset the environment to collect the first observation
        done = False
        obs = env.reset()
        while not done:
            action = env.action_space.sample()
            obs, reward, done, info = env.step(action)

            rollouts_obs.append(obs)
            rollouts_action.append(action)
            rollouts_reward.append(reward)
            rollouts_done.append(done)

jobs = []
for i in range(10):
    thread = threading.Thread(target=run_episode, args=(scale[i], agent, expert_net))
    jobs.append(thread)
    
for j in jobs:
    j.start()

for j in jobs:
    j.join()
Ljackson

1 Answer

One option is to pass the same list (or another thread-safe data structure) as an argument to every run_episode thread and append each thread's results to it at the end of the function, without returning anything. After all threads have joined, the list contains all the results in order of completion. Note that lists themselves are thread-safe, but their contents are not (see "Are lists thread-safe?"), so only append to the shared list and never read or modify the appended data inside run_episode.
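A minimal sketch of this idea follows. It is an illustration under stated assumptions, not a drop-in fix: `ToyEnv` is a hypothetical stand-in for `BipedalWalker` so the snippet runs without Gym, and `worker_id`, `scales` and `results` are names invented here. Each thread builds its own local rollout and appends it to the shared list exactly once. The sketch also leaves out the `with lock:` block from the question, since holding a lock around the whole episode would make the threads run one after another.

```python
import random
import threading


class ToyEnv:
    """Hypothetical stand-in for BipedalWalker (same reset/step shape)."""

    def __init__(self, scale, seed):
        self.scale = scale
        self.rng = random.Random(seed)  # per-env RNG, not shared across threads
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def sample_action(self):
        return self.rng.uniform(-1.0, 1.0)

    def step(self, action):
        self.t += 1
        obs = self.scale * action
        reward = -abs(action)
        done = self.t >= 5  # fixed-length episode, for illustration only
        return obs, reward, done, {}


def run_episode(worker_id, scale, results):
    # The rollout is local to this call, so every thread gets its own lists.
    rollout = {"worker": worker_id, "obs": [], "action": [], "reward": [], "done": []}
    env = ToyEnv(scale, seed=worker_id)
    obs = env.reset()
    done = False
    while not done:
        action = env.sample_action()
        obs, reward, done, info = env.step(action)
        rollout["obs"].append(obs)
        rollout["action"].append(action)
        rollout["reward"].append(reward)
        rollout["done"].append(done)
    # Single append to the shared list once the episode is over.
    results.append(rollout)


scales = [0.5 + 0.1 * i for i in range(10)]  # assumed values
results = []  # shared list, only ever appended to by the workers

jobs = [
    threading.Thread(target=run_episode, args=(i, scales[i], results))
    for i in range(10)
]
for j in jobs:
    j.start()
for j in jobs:
    j.join()

# Results arrive in completion order; sort if a stable order is needed.
results.sort(key=lambda r: r["worker"])
```

After the joins, `results[i]["obs"]`, `results[i]["action"]` and so on hold the transitions collected by thread `i`. Tagging each rollout with its worker id is a design choice that makes the completion order irrelevant; a pre-allocated list indexed by `worker_id` would work just as well.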