The trick is to initialize each process in the pool so that its random number generator is seeded with a unique seed. This is achieved by using the initializer argument of the Pool constructor.
The first demo uses the same seed for every process in the pool and shows that all processes then return the same random numbers. This is not what you want, because each process in the pool starts off with a random number generator in an identical initial state:
import numpy as np
import multiprocessing
import time

def init_pool():
    np.random.seed(1)

def worker(i):
    # ensure each process in the pool processes one request each:
    time.sleep(1)
    return multiprocessing.current_process().pid, np.random.random()

if __name__ == '__main__':
    pool = multiprocessing.Pool(8, initializer=init_pool)
    results = pool.map(worker, range(8))
    for pid, number in results:
        print(f'pid={pid}, random number={number}')
Prints:
pid=46512, random number=0.417022004702574
pid=3444, random number=0.417022004702574
pid=13716, random number=0.417022004702574
pid=10800, random number=0.417022004702574
pid=47360, random number=0.417022004702574
pid=49932, random number=0.417022004702574
pid=51144, random number=0.417022004702574
pid=27360, random number=0.417022004702574
Note that on platforms where processes are created with fork (traditionally the default on Linux/Unix), the state of the random number generator is inherited by all processes in the pool, so they automatically start in the same state even without a pool-initializer function. The initializer in the above code is, however, required to reproduce this behavior on a platform such as Windows, which uses spawn to create new processes.
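The inheritance described in the note above can be checked with a short sketch. It assumes a Unix-like platform where the fork start method is available, and first_draw_per_process is a hypothetical helper name introduced only for this illustration. The parent seeds the global generator once, the pool is created without an initializer, and the first number drawn by each pool process is collected:

```python
import multiprocessing
import os
import time

import numpy as np


def worker(i):
    # Sleep briefly so the tasks tend to spread across the pool's processes.
    time.sleep(0.2)
    return os.getpid(), np.random.random()


def first_draw_per_process(pool_size=4):
    # Assumption: a Unix-like platform where the 'fork' start method exists.
    ctx = multiprocessing.get_context('fork')
    # Seed the parent; forked children start with a copy of this state.
    np.random.seed(1)
    with ctx.Pool(pool_size) as pool:  # note: no initializer is passed
        results = pool.map(worker, range(pool_size), chunksize=1)
    first = {}
    for pid, number in results:
        # Results come back in task order, so the first entry seen for a
        # pid is the first number that process drew.
        first.setdefault(pid, number)
    return first


if __name__ == '__main__':
    for pid, number in first_draw_per_process().items():
        print(f'pid={pid}, first random number={number}')
```

If the state is inherited as described, every process should report the same first number even though no initializer was used.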
The next demo initializes each process's random number generator with the current process's pid value. This is what you want, since it guarantees that each process in the pool starts off with a random number generator seeded with its own unique value:
import numpy as np
import multiprocessing
import time

def init_pool():
    np.random.seed(multiprocessing.current_process().pid)

def worker(i):
    # ensure each process in the pool processes one request each:
    time.sleep(1)
    return multiprocessing.current_process().pid, np.random.random()

if __name__ == '__main__':
    pool = multiprocessing.Pool(8, initializer=init_pool)
    results = pool.map(worker, range(8))
    for pid, number in results:
        print(f'pid={pid}, random number={number}')
Prints:
pid=19460, random number=0.5645643493822622
pid=23612, random number=0.5480593060571878
pid=28288, random number=0.2637370242174355
pid=6440, random number=0.6107958535345932
pid=24452, random number=0.6173634654672119
pid=1716, random number=0.2570898341750626
pid=14912, random number=0.11239641110464715
pid=49184, random number=0.34255660011034006
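A possible variation, if you also want runs to be reproducible from a single base seed, is to combine the pid with that base seed through np.random.SeedSequence and give each process its own Generator from np.random.default_rng (both available in NumPy 1.17 and later). This is a different technique from the np.random.seed call used above and is only a sketch: make_rng, the module-level _rng, and the base seed 12345 are names and values assumed for illustration.

```python
import multiprocessing
import os

import numpy as np

_rng = None  # per-process Generator, set by init_pool (hypothetical name)


def make_rng(base_seed, worker_id):
    # SeedSequence mixes base_seed with worker_id (via spawn_key), so
    # different worker ids yield different generator states.
    seq = np.random.SeedSequence(base_seed, spawn_key=(worker_id,))
    return np.random.default_rng(seq)


def init_pool(base_seed):
    # Runs once in each pool process; the pid distinguishes the processes.
    global _rng
    _rng = make_rng(base_seed, os.getpid())


def worker(i):
    return os.getpid(), _rng.random()


if __name__ == '__main__':
    with multiprocessing.Pool(4, initializer=init_pool, initargs=(12345,)) as pool:
        for pid, number in pool.map(worker, range(4)):
            print(f'pid={pid}, random number={number}')
```

Because pids differ from run to run, the numbers are only repeatable if the same ids are supplied again; passing an explicit worker index instead of the pid would be one way around that.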