
I have scoured the internet for an answer, and nothing I can find applies to my situation. I have read about `multiprocessing.Manager` and have tried passing things back and forth, but none of it seems to play well with NumPy arrays. I have tried using `Pool` instead, but my target method does not return anything; it just makes changes to an array, so I wasn't sure how to set that up either.

Right now I have:

def Multiprocess(self, sigmaI, sigmaX):
    cpus = mp.cpu_count()
    print('Number of cpu\'s to process WM: %d' % cpus)

    processes = [mp.Process(target=self.CreateMatrixMp, args=(sigmaI, sigmaX, i))
                 for i in range(self.numPixels)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

The target function, `CreateMatrixMp`, takes the values passed and, after doing calculations, appends a value to an array, `data`. This array is declared as `self.data = numpy.zeros(self.size, numpy.float64)`. If the details of the `CreateMatrixMp` method would help, I can post that as well.

I tried adding this above where the processes are run:

mgr = mp.Manager()
sharedData = mgr.Array(ctypes.c_numpy.float64, self.data)

and then passing `sharedData` to `CreateMatrixMp`, where it can be modified. Once all the processes have run and the array is complete, I simply do `self.data = sharedData`.

But this doesn't work (though I know I am not setting it up correctly). How should this be done with a NumPy array? I want each and every process (there will be thousands of them) to append to the same array.

Any help is enormously appreciated.

pretzlstyle
  • Can you elaborate on why the problem requires thousands of processes? You might reach a point of diminishing return well before that, see http://stackoverflow.com/questions/20039659/python-multiprocessings-pool-process-limit – paisanco Jul 28 '15 at 22:50
  • Because this is for an image application, the `CreateMatrixMp` method will run ((# of pixels)^2) times. It creates a weight on the edge between each and every pair of pixels (nodes). – pretzlstyle Jul 28 '15 at 23:47
  • Unless I'm not understanding that sounds like the kind of problem that would be solved by convolving the image with a kernel. I still don't understand why thousands of processes would be needed. – paisanco Jul 29 '15 at 00:57
  • @paisanco Thousands of processes are only needed because I want to parallelize it. The algorithm loops through a function that focuses on a single pixel and then compares it to every other pixel in the image in a `for` loop, assigning an edge value to each one. That is then performed on every pixel. So actually there are only (# of pixels) processes, since each process loops through the entire image. But still, for a large image that would be thousands. What is meant by "convolving the image with a kernel"? – pretzlstyle Jul 29 '15 at 15:16
  • Did you already consider IPython's parallel map (http://ipython.org/ipython-doc/dev/parallel/parallel_multiengine.html#quick-and-easy-parallelism)? – Dietrich Jul 29 '15 at 19:34

1 Answer


Welcome to the dark world of multiple threads. I think your big problem here is that `mgr.Array` puts synchronisation around the array. If you generate data quickly, this will be a bottleneck, since processes will be waiting for the last one to finish with the array.

It is more efficient, and will help, if each process keeps a private copy of the NumPy array. Once you have fed in all the data, wait for all the processes to complete. Then you can combine all the arrays into `self.data`. This way none of the processes need to wait on a shared resource.

Neither this solution nor yours guarantees the order of the output list. I suspect from `self.numPixels` that order may be important. There are many solutions, but the easiest is to feed an index in with the data and do a `self.data.sort(...)` after all is done. Alternatively, and faster, pre-create `self.data` and have the processes poke results into the correct location. `self.data` does not need to be a shared data structure, since the processes are never changing anything in common. This works if arrays map to C-like arrays; it will not work for linked lists, etc.

Hope this helps. Ask if you want more details.
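For concreteness, here is a minimal sketch of the "poke results into the correct location" idea that also works across processes, using a shared `multiprocessing.Array` viewed as a NumPy array. `fill_row` and the sizes/parameters are hypothetical stand-ins for `CreateMatrixMp` and your real inputs:

    import ctypes
    import multiprocessing as mp
    import numpy as np

    def fill_row(shared, size, row, sigmaI, sigmaX):
        # View the shared buffer as a 2-D NumPy array; no copy is made.
        data = np.frombuffer(shared, dtype=np.float64).reshape(size, size)
        # Placeholder calculation standing in for CreateMatrixMp.
        # Each worker writes only its own row, so no locking is needed.
        data[row, :] = np.arange(size) * sigmaI + row * sigmaX

    if __name__ == '__main__':
        size = 8                                              # e.g. number of pixels
        shared = mp.Array(ctypes.c_double, size * size, lock=False)
        procs = [mp.Process(target=fill_row, args=(shared, size, i, 0.1, 0.2))
                 for i in range(size)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        data = np.frombuffer(shared, dtype=np.float64).reshape(size, size)
        print(data)

Because each worker owns a distinct row of the buffer, the order is baked into the row index and no sorting is needed afterwards.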

Paul Marrington
  • Thank you for the answer. I agree that it would be better to have private arrays, and just do one big write at the end once every value is found. But I also agree that the processes are never changing anything in common (at least they shouldn't). Does that mean that they can edit the array in different places simultaneously, and nothing will have to wait? When I try to run it like this, I for some reason get an array full of zeroes, as if the processes are erasing each others values. – pretzlstyle Jul 29 '15 at 15:19
  • As I commented at the end, it will only work for static arrays and depends on the Python implementation. The same goes for NumPy arrays - except that they are different from Python lists, so they may be generated differently. It is unlikely (though not impossible) that the processes are erasing each other's data. It is possible that you are getting copies rather than a reference to the data. I should not have mentioned the approach, since it relies on language-internal implementation that may change between systems (2.7 vs 3, PyPy, CPython or Jython, etc.). Having said that, it should have worked. – Paul Marrington Jul 29 '15 at 22:10
  • I was just putting together a sample when I came across http://stackoverflow.com/questions/25938187/trying-to-use-multithreading-to-fill-an-array-in-python One respondent points out that my suggestion will work with multiple threads but not multiple processes; the latter are independent and have their own memory spaces. Interestingly, the last responses introduce `multiprocessing.Pool`, which, if not told otherwise, creates a worker process per CPU - similar to your original example. – Paul Marrington Jul 29 '15 at 22:21
  • I have ended up using `Pool` and it is working well. Thanks for your advice, I will mark this as answered. – pretzlstyle Jul 30 '15 at 19:27
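For reference, a minimal sketch of the `Pool` route mentioned above, where each worker returns its finished row and the parent assembles them. `compute_row` and the example parameters are hypothetical stand-ins for `CreateMatrixMp` and the real inputs:

    import multiprocessing as mp
    import numpy as np

    SIZE = 8                          # e.g. number of pixels (assumed example value)
    SIGMA_I, SIGMA_X = 0.1, 0.2       # assumed example parameters

    def compute_row(row):
        # Stand-in for CreateMatrixMp: return the finished row instead of
        # mutating shared state.
        return np.arange(SIZE) * SIGMA_I + row * SIGMA_X

    if __name__ == '__main__':
        pool = mp.Pool(mp.cpu_count())
        rows = pool.map(compute_row, range(SIZE))   # results come back in input order
        pool.close()
        pool.join()
        data = np.vstack(rows)                      # plays the role of self.data
        print(data.shape)

Because `Pool.map` preserves input order, the rows land in the right place without any shared state or sorting.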