I am looking for a way to run two functions in parallel, each executing over a given set of arguments. I use pool.map to achieve this: I create two separate processes, and each process starts a pool that executes the map. This works. The order of execution is a little bit wild, but I will save that for another question.
Now I have also found another approach here (see the first answer). It uses only one pool and calls map_async twice in a row. So I was wondering whether there is a preferred way of doing this.
I have read (sadly I don't remember where) that it is best to use only one pool, which would mean the second approach is better. But when I measure the time, the first approach (two pools in separate processes) is actually a little bit faster. Additionally, in the first approach the functions really run in parallel, whereas in the second approach the first call of map_async executes first, then the second call.
Here is my test code:
from multiprocessing import Process, Pool
import time
import os

multiple_pools = True
data = list(range(1, 11))


def func_a(param):
    print(f'running func_a in process {os.getpid()}')
    print(f'passed argument: {param}')
    print('calculating...\n')
    time.sleep(1.5)
    print('done\n')


def func_b(param):
    print(f'running func_b in process {os.getpid()}')
    print(f'passed argument: {param}')
    print('calculating...\n')
    time.sleep(2.5)
    print('done\n')


def execute_func(func, param):
    p = Pool(processes=8)
    with p:
        p.map(func, param)


if __name__ == '__main__':
    if not multiple_pools:
        t0 = time.time()
        p = Pool(processes=8)
        res = p.map_async(func_a, data)
        res = p.map_async(func_b, data)
        p.close()
        p.join()
        t1 = time.time()
        dt = t1 - t0
        print(f'time spent with one pool: {dt} s')
    else:
        t0 = time.time()
        p1 = Process(target=execute_func, args=(func_a, data))
        p2 = Process(target=execute_func, args=(func_b, data))
        p1.start()
        p2.start()
        p1.join()
        p2.join()
        p1.close()
        p2.close()
        t1 = time.time()
        dt = t1 - t0
        print(f'time spent with two pools, each inside an own process: {dt} s')
So again, my question: is one way preferred over the other? Or are there other, better ways to do this?
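One candidate for such an alternative, as a minimal sketch and not a measured result: since a single pool hands out tasks in the order they were submitted, the two functions could be interleaved in one task list and submitted with apply_async, so that neither function has to wait for the other's whole batch. The helper names interleave_tasks and run_interleaved, and the stand-in functions square and negate (used in place of func_a and func_b so the example finishes quickly), are hypothetical and not part of the test code above.

```python
from itertools import chain
from multiprocessing import Pool


def square(x):
    # Hypothetical stand-in for func_a.
    return x * x


def negate(x):
    # Hypothetical stand-in for func_b.
    return -x


def interleave_tasks(func_a, func_b, data):
    """Return [(func_a, d0), (func_b, d0), (func_a, d1), (func_b, d1), ...]."""
    return list(chain.from_iterable(((func_a, d), (func_b, d)) for d in data))


def run_interleaved(tasks, processes=8):
    """Submit every (func, arg) pair to one pool; return results in task order."""
    with Pool(processes=processes) as pool:
        pending = [pool.apply_async(func, (arg,)) for func, arg in tasks]
        # Collect the results before leaving the with-block, because
        # exiting it terminates the pool.
        return [result.get() for result in pending]


if __name__ == '__main__':
    tasks = interleave_tasks(square, negate, [1, 2, 3])
    print(run_interleaved(tasks, processes=2))  # prints [1, -1, 4, -2, 9, -3]
```

Whether this is faster than two pools in separate processes would still have to be measured with the real func_a and func_b; the sketch only shows how one pool can work on both functions at the same time.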