How to return numpy array from multiprocessing pool map?

Question

I have a function "func" that receives a list of the row indices of an image called "matrix_image":

list_rows = range(N_rows)

Then, computation is done inside func and I get a new row of the resultant matrix representing an image.

def func(list_rows):
    new_row = numpy.empty(N_columns)
    ....
    ....# some computation
    ....
    return new_row


matrix_image = pool.map(func, list_rows)

"new_row" is correctly calculated inside func (I can see its values in the debugger), but the resulting matrix_image has shape (N_rows,) and every value in it is None.

How can I return a row (a 1D numpy array) from func so that it is appended to the resulting matrix (a 2D numpy array)?

With `return`? If something went wrong when you tried it, you're going to have to show us a [mcve] of the problem. — user2357112, May 22 '17 at 17:25
I have added the clear problem with the resultant values obtained in map. This example is absolutely complete for the question. — eduardosufan, May 22 '17 at 17:42
@eduardosufan, you've misunderstood what a MCVE is. You need to post code that can be pasted directly into a script and run without modification, and that reproduces the problem. When I created a basic working script based on what you have here, it worked as expected -- the arrays were returned correctly. — senderle, May 22 '17 at 18:31
Take a look at the following post; it may point you in the right direction: [How to use Python multiprocessing Pool.map to fill numpy array in a for loop](https://stackoverflow.com/questions/25888255/how-to-use-python-multiprocessing-pool-map-to-fill-numpy-array-in-a-for-loop) — Radmar, May 23 '17 at 08:55
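
For reference, a basic script of the kind senderle describes might look like the sketch below. It is not the asker's code: the names compute_row, build_matrix, N_ROWS and N_COLUMNS, the pool size and the placeholder computation are all assumptions made for illustration. It only shows that a 1D array returned from the mapped function comes back through pool.map, and that the resulting list of rows can be turned into a 2D array with np.array.

```python
import multiprocessing as mp

import numpy as np

N_ROWS = 4      # hypothetical image height
N_COLUMNS = 5   # hypothetical image width


def compute_row(row_index):
    """Return one row of the result as a 1D array (placeholder computation)."""
    new_row = np.empty(N_COLUMNS)
    new_row[:] = row_index * 10 + np.arange(N_COLUMNS)
    return new_row  # a function without a return statement returns None


def build_matrix():
    """Map compute_row over all row indices and stack the rows into a 2D array."""
    with mp.Pool(processes=2) as pool:
        # pool.map returns a list of 1D arrays, in the order of the inputs.
        rows = pool.map(compute_row, range(N_ROWS))
    return np.array(rows)  # shape (N_ROWS, N_COLUMNS)


if __name__ == "__main__":
    matrix_image = build_matrix()
    print(matrix_image.shape)  # (4, 5)
```

If the real func has a code path that ends without reaching `return new_row`, pool.map would collect None for those calls, which would be consistent with the symptom described in the question. That is only a guess, since the full function is not shown.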

score 3 · Answer 1 · edited Jan 18 '21 at 13:58

You can use the RawArray functionality of multiprocessing. Before starting the process, define the variable that it needs to access as a RawArray. After the process has finished, access that buffer as a numpy array (reshaped if necessary).

Here is an example:

import numpy as np
import multiprocessing as mp

n_elements = 1000  # how many elements your numpy array should have

def myProc( shared_var ):
    '''
    Here you convert the shared variable from mp.RawArray to numpy,
    then treat it like any numpy array, e.g. fill it with some
    random numbers for demonstration purposes.
    '''
    var = np.frombuffer( shared_var, dtype=np.uint32 )
    for i in range( n_elements ):
        # draw a scalar rather than a length-1 array
        var[i] = np.random.randint( 0, 2**16 )
    print( 'myProc var.mean() = ', var.mean() )

# the guard is needed where child processes are spawned (e.g. Windows)
if __name__ == '__main__':
    # buffer that contains the memory; 'I' (unsigned int) matches np.uint32
    mp_var = mp.RawArray( 'I', n_elements )
    p = mp.Process( target=myProc, args=(mp_var,) )
    p.start()
    p.join()

    # after the process has ended, you convert the buffer that was passed to it
    var = np.frombuffer( mp_var, dtype=np.uint32 )
    # and again, you can treat it like a numpy array
    print( '   out var.mean() = ', var.mean() )

The output is:

myProc var.mean() =  32612.403
   out var.mean() =  32612.403

Hope that helps!

Please note that if you access this buffer from concurrent processes you need to organise a proper locking mechanism so no two processes modify the same piece of memory at the same time.
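
One way to get such locking is sketched below. It swaps RawArray for mp.Array, which by default wraps the buffer together with a lock that get_lock() returns. The names add_ones, run_workers, N_ELEMENTS, N_WORKERS and N_INCREMENTS are hypothetical and chosen only for this illustration, and the update itself (adding 1 to every element) is a placeholder.

```python
import multiprocessing as mp

import numpy as np

N_ELEMENTS = 8      # hypothetical buffer size
N_WORKERS = 4       # hypothetical number of processes
N_INCREMENTS = 100  # hypothetical number of updates per process


def add_ones(shared_arr, n_increments):
    """Add 1 to every element n_increments times, locking around each update."""
    for _ in range(n_increments):
        with shared_arr.get_lock():
            # View the shared buffer as a numpy array without copying it.
            view = np.frombuffer(shared_arr.get_obj(), dtype=np.intc)
            view += 1


def run_workers():
    """Start N_WORKERS processes that all update the same shared buffer."""
    shared = mp.Array("i", N_ELEMENTS)  # zero-filled; lock=True by default
    workers = [
        mp.Process(target=add_ones, args=(shared, N_INCREMENTS))
        for _ in range(N_WORKERS)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # Copy the result out of shared memory before returning it.
    return np.frombuffer(shared.get_obj(), dtype=np.intc).copy()


if __name__ == "__main__":
    # With the lock held for each update, every element should end up
    # equal to N_WORKERS * N_INCREMENTS.
    print(run_workers())
```

Holding the lock for every single update is simple but slow. If each process only ever writes to its own slice of the buffer, the lock may not be needed at all.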

`mp_buff` seems to be what you refer to as `mp_var`, am I correct? — Ruli, Jan 18 '21 at 12:28
