
I have seen a number of questions on using Python's multiprocessing, but I have not been able to quite wrap my head around how to apply it to my code.

Suppose I have an NxM array. I have a function f that compares the value at (i,j) to every other pixel. So, in essence, I compute NxM values at every point on the grid.

My machine has four cores. I envision that I would split the input locations into four quadrants and then feed each quadrant to a different process.

So, schematically, my current code would be:

def f(x, array):
    """
    For input location x, do some operation 
    with every other array value. As a simple example,
    this could be the product of array[x] with all other
    array values
    """
    return array[x]*array

if __name__ == '__main__':
    array = some_array
    for x in range(array.size):
        returnval = f(x,array)


What would be the best strategy in optimizing a problem like this using multiprocessing?

user1991
  • How do you want to store the answers? In another array? – bbayles Nov 07 '15 at 20:46
  • If your problem is mathematical you may want to look at `numpy`. Despite CPython being bound to single-core performance by the GIL, numpy code releases it, so your code can be even more effective with threading+numpy than with multiprocessing. – myaut Nov 07 '15 at 22:48
  • @bbayles yes, it would have to be another array of the same size, NxM. Retaining the order is thus important. – user1991 Nov 08 '15 at 08:58
  • @myaut could you elaborate on how numpy specifically would help me here? In addition, as I understand it threading is particularly useful for I/O, not so much for heavy calculations? Thanks! – user1991 Nov 08 '15 at 08:58
  • @John: Check this answer: http://stackoverflow.com/questions/6200437/numpy-and-global-interpreter-lock – myaut Nov 09 '15 at 11:49

2 Answers


The multiprocessing library can do most of the heavy lifting for you, with its Pool.map function.

The only limitation is that it must be called with a function that takes a single argument: the value you are iterating over. Based on this other question, the final implementation would be:

from multiprocessing import Pool
from functools import partial

if __name__ == '__main__':
    array = some_array
    # Make a version of f that only takes the x parameter
    partial_f = partial(f, array=array)
    pool = Pool(processes=5)  # number of CPUs + 1
    returnval = pool.map(partial_f, range(array.size))
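
Note that pool.map returns results in the same order as the input iterable, so the per-location results can be stacked back into the grid afterwards (which matters since you said order must be retained). A minimal sketch of that last step, assuming NumPy and a small placeholder array; the variant of f below indexes with array.flat[x] so that x can run over all N*M locations:

import numpy as np
from multiprocessing import Pool
from functools import partial

def f(x, array):
    # Example operation: product of the x-th value with every other value
    return array.flat[x] * array

if __name__ == '__main__':
    array = np.arange(12.0).reshape(3, 4)    # placeholder for some_array
    partial_f = partial(f, array=array)
    with Pool(processes=4) as pool:          # one worker per core
        results = pool.map(partial_f, range(array.size))
    # results[k] is the NxM array for flat location k; stack and reshape so
    # that stacked[i, j] is the NxM result for grid location (i, j)
    stacked = np.stack(results).reshape(array.shape + array.shape)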
memoselyk

The strategy you've described - send the data set to several workers, have each compute values for part of the data set, and stitch them together at the end - is the classical way to parallelize computation for an algorithm like this where answers don't depend on each other. This is what MapReduce does.

That said, introducing parallelism makes programs harder to understand, and harder to adapt or improve later. Both the algorithm you've described, and the use of raw Python for large numerical computations, might be better things to change first.

Consider checking whether expressing your calculation with numpy arrays makes it run faster, and whether you can find or develop a more efficient algorithm for the problem you're trying to solve, before parallelizing your code.
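
For the simple example in the question (the product of each value with every other value), the vectorized form is just an outer product, so no explicit loop over locations is needed at all. A minimal sketch, assuming NumPy and a placeholder array:

import numpy as np

array = np.arange(12.0).reshape(3, 4)            # placeholder for some_array
# out[i, j, k, l] == array[i, j] * array[k, l]: for each location (i, j),
# the NxM array of products with every other value, computed in one call
all_products = np.multiply.outer(array, array)   # shape (N, M, N, M)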

Will Angley