There are several issues with the code. First, if you want to run this under Jupyter Notebook on Windows, then your worker functions func1 and func2 need to be placed in an external module, for example workers.py, and imported from there. That means you must now either pass the Numbers array as an argument to the workers or initialize static storage of each pool process with the array when you initialize the pool. We will use the second method with a function called init_pool, which also has to be imported if we are running under Notebook (a sketch of the first method follows the workers.py listing below):
workers.py
def func1():
    Solution_1 = Numbers + 10
    return Solution_1

def func2():
    Solution_2 = Numbers * 10
    return Solution_2

def init_pool(n_array):
    # Runs once in each pool process: store the passed array as a global
    # so that func1 and func2 can reference Numbers.
    global Numbers
    Numbers = n_array
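For comparison, here is a minimal sketch of the first method, where the array is instead passed to each worker as an explicit argument (the parameter name n_array in this variant is my own choice, not from the original code):

workers.py (variant)

def func1(n_array):
    # The array arrives as an argument, pickled and sent with each task:
    return n_array + 10

def func2(n_array):
    return n_array * 10

With this variant no pool initializer is needed; you would submit the jobs as pool.apply_async(function, args=(Numbers,)). The trade-off is that the array is pickled and shipped to a pool process on every submission, whereas the initializer approach sends it once per process.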
The second issue is that when running under Windows, the code that creates sub-processes or a multiprocessing pool must be within a block governed by the conditional if __name__ == '__main__':. This is because Windows uses the spawn start method, which re-imports the main module in every child process; without the guard, the pool-creation code would be re-executed recursively in each child.
Third, it is wasteful to create a pool larger than 2 if you are only trying to run two parallel "jobs." And fourth, and I think finally, you are using the wrong pool method. apply blocks until the submitted "job" (i.e. the one processed by func1) completes, so you are not achieving any degree of parallelism at all; under the covers, pool.apply(func) is just pool.apply_async(func).get(). You should be using apply_async.
import multiprocessing as mp
import numpy as np

from workers import func1, func2, init_pool

if __name__ == '__main__':
    #num_cores = mp.cpu_count()
    Numbers = np.array([1,2,3,4,5,6,7,8,9,10,11,12])
    pool = mp.Pool(2, initializer=init_pool, initargs=(Numbers,)) # more than 2 is wasteful
    # This is to use all functions easily
    functions = [func1, func2]
    # This is to store the results
    solutions = []
    results = [pool.apply_async(function) for function in functions]
    for result in results:
        solutions.append(result.get()) # wait for completion and get the result
    print(solutions)
Prints:
[array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22]), array([ 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120])]
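If you want to convince yourself that apply serializes the work while apply_async overlaps it, here is a small timing sketch (the sleep_worker function and the 2-second delay are illustrative inventions, not part of the original code; under Windows/Notebook the worker would again need to live in an importable module):

import multiprocessing as mp
import time

def sleep_worker():
    time.sleep(2)  # stand-in for real work

if __name__ == '__main__':
    with mp.Pool(2) as pool:
        start = time.time()
        pool.apply(sleep_worker)   # blocks until the job finishes
        pool.apply(sleep_worker)   # starts only after the first job is done
        print(f'apply:       {time.time() - start:.1f}s')  # ~4 seconds

        start = time.time()
        results = [pool.apply_async(sleep_worker) for _ in range(2)]
        for r in results:
            r.get()                # both jobs were already running in parallel
        print(f'apply_async: {time.time() - start:.1f}s')  # ~2 seconds

Using the pool as a context manager, as in this sketch, also terminates the pool automatically when the block exits.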
