
I have a large NumPy array of shape (8, 512, 512, 50, 3) that is filled in a for loop running 8 times, each iteration calling a function on some images. Can I use multiprocessing / concurrent features to fill the NumPy array in less time?

import numpy as np

def myfun(arr):
    # some computation on one input row
    return out  # out has shape (512, 512, 50, 3)

X = np.empty((8, 512, 512, 50, 3))
inp = np.ones((8, 1000))

for i in range(8):
    X[i] = myfun(inp[i])
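
For example, would a process pool work here? A minimal sketch of what I mean (the stub myfun stands in for the real function, which would need to be defined at module level so it can be pickled; note that each (512, 512, 50, 3) float64 result is roughly 300 MB, and the pool has to pickle all of them back to the parent):

import numpy as np
from multiprocessing import Pool

def myfun(arr):
    # stand-in for the real computation
    return np.zeros((512, 512, 50, 3))

if __name__ == "__main__":
    inp = np.ones((8, 1000))
    with Pool(processes=8) as pool:
        # map sends one row of inp to each worker and
        # collects the results in submission order
        results = pool.map(myfun, list(inp))
    X = np.stack(results)  # shape (8, 512, 512, 50, 3)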

Update: I tried to use multiprocessing this way, but it was slower than the sequential version.

import multiprocessing

def myfun_mp(inp, return_list):
    return_list.append(myfun(inp))

manager = multiprocessing.Manager()
return_list = manager.list()
jobs = []

for i in range(8):
    p = multiprocessing.Process(target=myfun_mp, args=(inp[i], return_list))
    jobs.append(p)
    p.start()

for p in jobs:
    p.join()

X = np.array(return_list)  # this conversion takes most of the time
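
I suspect the manager list is the bottleneck: every result is pickled over to the manager process and then copied again by np.array. A sketch of what I think a copy-free version would look like, where each child writes its slice directly into a shared buffer via multiprocessing.shared_memory (Python 3.8+; the stub myfun stands in for the real function, which would need to be importable by the workers):

import numpy as np
from multiprocessing import Process, shared_memory

SHAPE = (8, 512, 512, 50, 3)

def myfun(arr):
    # stand-in for the real computation
    return np.zeros((512, 512, 50, 3))

def worker(shm_name, i, inp_row):
    # attach to the existing shared block and write the result in place
    shm = shared_memory.SharedMemory(name=shm_name)
    X = np.ndarray(SHAPE, dtype=np.float64, buffer=shm.buf)
    X[i] = myfun(inp_row)
    shm.close()

if __name__ == "__main__":
    inp = np.ones((8, 1000))
    shm = shared_memory.SharedMemory(create=True,
                                     size=int(np.prod(SHAPE)) * 8)
    X = np.ndarray(SHAPE, dtype=np.float64, buffer=shm.buf)

    jobs = [Process(target=worker, args=(shm.name, i, inp[i]))
            for i in range(8)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()

    result = X.copy()  # copy out before releasing the shared block
    shm.close()
    shm.unlink()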
  • It depends on the nature of the computation in `myfun(arr)`. If the computations are independent of each other then there should be no problem, i.e. if you split your computation into 8 blocks, the data contained in each block should be sufficient to fill your array. – DrBwts Jul 17 '19 at 17:58
  • The cost of exchanging data between processes is high, so I'd discourage it. Most NumPy functions release the global interpreter lock (GIL) when entering C functions, which lets you take advantage of multithreading (see the thread-based sketch after these comments). – tstanisl Jul 17 '19 at 18:00
  • @DrBwts The computations are independent of each other. I was having difficulty because changes made to X in a child process are not reflected in the parent process. – Srikar Ym Jul 17 '19 at 19:24
  • @SrikarYm can you add the multithreaded code you have tried? – DrBwts Jul 18 '19 at 12:17
  • @DrBwts Updated the post with the multiprocessing code I tried. – Srikar Ym Jul 18 '19 at 13:43
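
Following tstanisl's suggestion, a minimal thread-based sketch (the stub myfun stands in for the real computation; this only helps if the real myfun spends most of its time in NumPy/C routines that release the GIL):

import numpy as np
from concurrent.futures import ThreadPoolExecutor

def myfun(arr):
    # stand-in for the real computation
    return np.zeros((512, 512, 50, 3))

X = np.empty((8, 512, 512, 50, 3))
inp = np.ones((8, 1000))

def fill_row(i):
    # threads share the interpreter's memory, so writing into X
    # directly needs no copying; each thread touches a distinct slice
    X[i] = myfun(inp[i])

with ThreadPoolExecutor(max_workers=8) as ex:
    # list() forces completion and surfaces any worker exceptions
    list(ex.map(fill_row, range(8)))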

0 Answers