
I'm looking to run the bottleneck in my program on multiple processors locally. I've looked into multiprocessing but it looks a little complicated for my purposes and I'm wondering if there is a simpler way.

I have a loop over 360 angles with a calculation that is independent for each one, so it doesn't matter what order it is done in. I have 8 cores, so I was hoping I could simply split that loop into 8 chunks of 45 angles, send those to different cores, and collect the results at the end. The simplified example looks something like this:

import numpy as np

dx = np.zeros(npixels)
for angle in range(360):
    dx += calculate_gradient_for_angle(angle, x, y, z, **kwargs)

Here, only the angle argument is variable. The rest are static.

I've looked into multiprocessing.pool.Pool.map but I can only find examples that show single-argument functions passed to it. As you can see my function takes multiple arguments. Any pointers would be much appreciated.

I'm using Python 3.7.8 on macOS 10.14.6

miterhen
  • "but that only takes one argument" what? – juanpa.arrivillaga Mar 23 '21 at 19:58
  • In any case, there is no less complicated way of leveraging multiple cores than using `multiprocessing`. That being said, a lot of `numpy` and `scipy` code may already be leveraging multiple cores, since they are calling native binaries – juanpa.arrivillaga Mar 23 '21 at 20:00
  • that one parameter can be a tuple, a list, a range, a class instance, whatever you need – Mirronelli Mar 23 '21 at 20:03
  • You can use [threading](https://docs.python.org/3/library/threading.html) to easily distribute chunks to threads and have processing run in parallel. – PApostol Mar 23 '21 at 20:31
  • @juanpa.arrivillaga, I meant that all examples I can find just show how to use `multiprocessing.pool.Pool.map` with a function that itself takes one argument. I've amended my post. Looking at the CPU load when my program runs, I doubt `numpy` is using multiple cores. – miterhen Mar 23 '21 at 22:11
  • @PApostol I thought the GIL meant that thread parallelism isn't true parallelism. Hence my question about `multiprocessing`. – miterhen Mar 23 '21 at 22:14
  • @miterhen yes, absolutely threading in CPython will not use multiple cores – juanpa.arrivillaga Mar 23 '21 at 22:31
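As Mirronelli's comment suggests, the single argument that `Pool.map` passes can itself be a tuple that a small wrapper unpacks. A minimal sketch of that idea, with a toy `calc` function standing in for the real `calculate_gradient_for_angle`:

```python
import multiprocessing as mp

def calc(angle, x, y, z):
    # Toy stand-in for the real per-angle calculation
    return angle * x + y - z

def calc_from_tuple(args):
    # Unpack the single tuple argument that Pool.map passes in
    return calc(*args)

if __name__ == "__main__":
    x, y, z = 2, 3, 1
    # One tuple per task: the variable angle plus the static arguments
    tasks = [(angle, x, y, z) for angle in range(360)]
    with mp.Pool() as pool:
        dx = sum(pool.map(calc_from_tuple, tasks))
    print(dx)  # 129960
```

The wrapper must be a module-level function (not a lambda) so it can be pickled and sent to the worker processes.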

2 Answers


You could build something like this. I got the idea from this question.

import concurrent.futures

def foo(bar):
    return bar

# Start each calculation in a separate thread
with concurrent.futures.ThreadPoolExecutor() as ex:
    f = [ex.submit(foo, angle) for angle in range(10)]

dx = 0

# Collect all results
for r in f:
    dx += r.result()

print(dx)
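Note that, as the comments point out, `ThreadPoolExecutor` won't use multiple cores for CPU-bound work because of the GIL. `ProcessPoolExecutor` has the same API but runs the work in separate processes; a sketch of the same toy example with that swap:

```python
import concurrent.futures

def foo(bar):
    return bar

if __name__ == "__main__":
    # Same interface as ThreadPoolExecutor, but tasks run in separate
    # processes, so CPU-bound work is not serialized by the GIL
    with concurrent.futures.ProcessPoolExecutor() as ex:
        f = [ex.submit(foo, angle) for angle in range(10)]

    # Collect all results
    dx = sum(r.result() for r in f)
    print(dx)  # 45
```

The `if __name__ == "__main__":` guard is required for process pools on macOS and Windows, where workers are spawned by re-importing the main module.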
Mario

The simple solution was to use functools.partial to turn the function into a one-argument function and pass that to multiprocessing.pool.Pool.map as follows:

from functools import partial
import multiprocessing as mp

# Bind the static arguments so only angle remains variable
calculate_grad_partial = partial(calculate_grad_for_angle, x=x, y=y, z=z)

with mp.Pool(processes=8) as pool:
    # pool.map returns a list of per-angle arrays; sum them
    # to reproduce the accumulation in the original loop
    dx = sum(pool.map(calculate_grad_partial, range(360)))
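An alternative to `functools.partial` is `Pool.starmap`, which unpacks each argument tuple into the function's positional arguments for you. A hedged sketch, with a toy `calc` function in place of `calculate_grad_for_angle`:

```python
import multiprocessing as mp

def calc(angle, x, y, z):
    # Toy stand-in for calculate_grad_for_angle
    return angle + x + y + z

if __name__ == "__main__":
    x, y, z = 1, 2, 3
    # starmap unpacks each tuple into calc's positional arguments,
    # so no wrapper or partial is needed
    args = [(angle, x, y, z) for angle in range(360)]
    with mp.Pool(processes=8) as pool:
        dx = sum(pool.starmap(calc, args))
    print(dx)  # 66780
```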
miterhen