Note that I have more argument sets to sweep through than available CPUs, so I'm not sure whether Python will automatically schedule the work across the CPUs as they become available.
Here is what I tried, but I get an error about the arguments:
import random
import multiprocessing
from train_nodes import run
import itertools
envs = ["AntBulletEnv-v0", "HalfCheetahBulletEnv-v0", "HopperBulletEnv-v0", "ReacherBulletEnv-v0",
        "Walker2DBulletEnv-v0", "InvertedDoublePendulumBulletEnv-v0"]
algs = ["PPO", "A2C"]
seeds = [random.randint(0, 200), random.randint(200, 400), random.randint(400, 600), random.randint(600, 800), random.randint(800, 1000)]
args = list(itertools.product(*[envs, algs, seeds]))
num_cpus = multiprocessing.cpu_count()
with multiprocessing.Pool(num_cpus) as processing_pool:
    processing_pool.map(run, args)
run takes three arguments: env, alg, and seed. For some reason, it doesn't receive all three here.
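For reference, here is a minimal sketch of what seems to be going on, assuming the cause is that Pool.map passes each tuple from args as one single argument. Pool.starmap unpacks each tuple into separate positional arguments instead. The function fake_run is a hypothetical stand-in for train_nodes.run, and the shortened env and seed lists are only for illustration. The pool is deliberately given fewer workers than there are tasks, since a Pool queues the remaining tasks and hands them out as workers become free.

```python
import itertools
import multiprocessing


def fake_run(env, alg, seed):
    # Hypothetical stand-in for train_nodes.run, which takes the same
    # three arguments; it only returns a label instead of training.
    return f"{env}/{alg}/{seed}"


# Shortened lists, fixed seeds: illustration values only.
envs = ["AntBulletEnv-v0", "HopperBulletEnv-v0"]
algs = ["PPO", "A2C"]
seeds = [1, 2, 3]

# Every (env, alg, seed) combination: 2 * 2 * 3 = 12 tuples.
arg_sets = list(itertools.product(envs, algs, seeds))

if __name__ == "__main__":
    # Two workers for twelve tasks: the pool queues the rest and
    # assigns them as workers finish.
    with multiprocessing.Pool(processes=2) as pool:
        # pool.map(fake_run, arg_sets) would call fake_run((env, alg, seed))
        # with ONE tuple argument and raise a TypeError.
        # starmap unpacks each tuple into fake_run(env, alg, seed).
        results = pool.starmap(fake_run, arg_sets)
    print(len(results), results[0])
```

An alternative that keeps map would be to change the function so it accepts a single tuple and unpacks it itself. The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module.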