I am trying to use Python multiprocessing. I wrapped my statements in a function and then used the multiprocessing map to loop over that function. I found that only the first iteration was actually processed and the rest were not (I checked this by printing the result).

Here are my questions:

  1. Why was only the first iteration computed?
  2. How can I return each array (B, C, and D) separately?
  3. My real calculations have a lot of stuff to calculate and return, so is there a more efficient way than wrapping all my statements in a function and then returning them all? Thanks

import numpy as np
import multiprocessing as mp

B=np.full((5,4,4),np.nan)
C=np.full((5,4,4),np.nan)
D=np.full((5,4,4),np.nan)


def job1(i):
    print(i)
    A=np.ones((4,4))
    B[i,:,:]=A+1
    C[i,:,:]=2*A+2
    D[i,:,:]=A+5
    return B,C,D
#%%

P=mp.Pool(5)
result=P.map(job1,np.arange(5))
P.close()
P.join()



result[0] 
(array([[[ 2.,  2.,  2.,  2.],
         [ 2.,  2.,  2.,  2.],
         [ 2.,  2.,  2.,  2.],
         [ 2.,  2.,  2.,  2.]],

        [[nan, nan, nan, nan],
         [nan, nan, nan, nan],
         [nan, nan, nan, nan],
         [nan, nan, nan, nan]],

        [[nan, nan, nan, nan],
         [nan, nan, nan, nan],
         [nan, nan, nan, nan],
         [nan, nan, nan, nan]],

        [[nan, nan, nan, nan],
         [nan, nan, nan, nan],
         [nan, nan, nan, nan],
         [nan, nan, nan, nan]],

        [[nan, nan, nan, nan],
         [nan, nan, nan, nan],
         [nan, nan, nan, nan],
         [nan, nan, nan, nan]]]),
Emma
Kernel
    Each process in the pool creates its own _copy_ of the arrays, then modifies its own B, C, D and returns them to the main process (actually another copy is returned). numpy should have shared memory functionality (if I remember correctly) to avoid that. – Michael Butscher Jul 30 '19 at 03:38
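
The shared-memory idea mentioned in the comment above can be sketched as follows. This is only a minimal illustration under stated assumptions: it uses the standard-library `multiprocessing.shared_memory` module (Python 3.8+) rather than a NumPy feature, handles only the array B, and the names `fill_slice`, `run`, `SHAPE`, and `DTYPE` are hypothetical helpers that do not appear in the question.

```python
import multiprocessing as mp
from multiprocessing import shared_memory

import numpy as np

SHAPE = (5, 4, 4)   # same shape as B in the question (assumption)
DTYPE = np.float64


def fill_slice(task):
    """Attach to the shared block by name and write slice i in place."""
    shm_name, i = task
    shm = shared_memory.SharedMemory(name=shm_name)
    B = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
    B[i, :, :] = np.ones((4, 4)) + 1
    del B        # drop the NumPy view before closing the handle
    shm.close()
    return i


def run(n_workers=5):
    """Allocate shared memory, fill it from a pool, return a private copy."""
    nbytes = int(np.prod(SHAPE)) * np.dtype(DTYPE).itemsize
    shm = shared_memory.SharedMemory(create=True, size=nbytes)
    B = None
    try:
        B = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
        B[:] = np.nan
        with mp.Pool(n_workers) as pool:
            # Each worker receives only the block name and an index.
            pool.map(fill_slice, [(shm.name, i) for i in range(SHAPE[0])])
        return B.copy()   # copy out before the block is released
    finally:
        B = None          # drop the view so close() can succeed
        shm.close()
        shm.unlink()


if __name__ == "__main__":
    print(run())
```

Because every worker writes into the same block, the main process sees all five slices filled in without any array being sent back through the pool. For small arrays like these, the overhead of setting up shared memory may well outweigh the benefit.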

1 Answer

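  1. Your code works as expected. You have 5 worker processes (Pool(5)) and 5 tasks (np.arange(5)), so each task may be executed by a different worker process. The results are not shared between tasks because each process modifies its own copy of B, C and D, as @Michael Butscher mentioned in the comment. That is why result[0] has only slice 0 filled in.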

  2. You can unpack the result after you get it from the Pool operation, for example like this (an intuitive way):

output = {'B':[], 'C':[], 'D':[]}
for r in result:
    output['B'].append(r[0])
    output['C'].append(r[1])
    output['D'].append(r[2])
  3. It is hard to work out the most efficient way to process your jobs without seeing reproducible code. To run several different functions in a pool, please refer to the following link:

Mulitprocess Pools with different functions

Jin
  • Thanks a lot for the answer, it helps me. Yet I am still wondering how to get a full copy of the array B, C, or D, that is, the final array after combining the work of all tasks. Having 5 copies of the same array is not useful for future calculations. – Kernel Jul 30 '19 at 05:46
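
One possible way to get a single combined array, sketched here under the assumption that job1 can be changed to return only the slices it computes instead of the whole global arrays: `Pool.map` returns results in the same order as its inputs, so the main process can stack the slices afterwards. The helper name `assemble` is hypothetical and not part of the original code.

```python
import multiprocessing as mp

import numpy as np


def job1(i):
    """Compute the three 4x4 slices for index i and return only those."""
    A = np.ones((4, 4))
    return A + 1, 2 * A + 2, A + 5


def assemble(results):
    """Turn a list of (b, c, d) tuples into three stacked arrays."""
    b_slices, c_slices, d_slices = zip(*results)
    return np.stack(b_slices), np.stack(c_slices), np.stack(d_slices)


if __name__ == "__main__":
    with mp.Pool(5) as pool:
        # results[i] holds the slices computed for index i
        results = pool.map(job1, range(5))
    B, C, D = assemble(results)
    print(B.shape, C.shape, D.shape)
```

Returning only the slices also reduces the amount of data pickled back to the main process, since each task no longer sends three full 5x4x4 arrays.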