
I have a Python program which looks like this:

total_error = []
for i in range(24):
    error = some_function_call(parameters1, parameters2)
    total_error += error

The function `some_function_call` takes a lot of time, and I can't find an easy way to reduce the time complexity of the function. Is there a way to still reduce the execution time by running the calls in parallel and later adding the results up in total_error? I tried using pool and joblib but could not successfully use either.

thechargedneutron
  • Is `some_function_call()` CPU or I/O bound? If it is CPU bound, take a look at the first answer to [this](https://stackoverflow.com/questions/21959355/python-multiprocessing-with-a-single-function) post. That is a very simple implementation using the `multiprocessing` library. – pstatix Jan 03 '18 at 17:30
  • I am not sure whether to classify it as CPU or I/O bound. The function basically calculates the least cost walk in a graph. The input that I give to the function is typically two vectors of 1200 length. – thechargedneutron Jan 03 '18 at 17:34
  • The work is CPU bound. When determining whether work is CPU or I/O bound, think in terms of "If my CPU were faster, would the work be done faster?" For example, calculations are generally CPU bound, but reading from and writing to a file are not (they are either network I/O or disk I/O). – pstatix Jan 03 '18 at 17:44
  • It seems like you should probably use `Pool`; can you post your attempt at it so we can troubleshoot? – poompt Jan 03 '18 at 17:52

2 Answers


You can also use concurrent.futures in Python 3, which provides a simpler interface than multiprocessing. See this for more details about the differences.

from concurrent import futures

total_error = 0

with futures.ProcessPoolExecutor() as pool:
    for error in pool.map(some_function_call, parameters1, parameters2):
        total_error += error

In this case, parameters1 and parameters2 should each be a list or other iterable of the same length as the number of times you want to run the function (24, as per your example).
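The executor's map pairs the two iterables element-wise, just like the built-in map. A minimal runnable sketch, where add_squares and the integer inputs are placeholders for your actual function and data:

from concurrent import futures

def add_squares(a, b):  # placeholder for some_function_call
    return a * a + b * b

if __name__ == "__main__":  # guard so worker processes can safely re-import this module
    parameters1 = list(range(24))  # 24 first arguments
    parameters2 = list(range(24))  # 24 second arguments
    with futures.ProcessPoolExecutor() as pool:
        # calls add_squares(0, 0), add_squares(1, 1), ..., add_squares(23, 23)
        total_error = sum(pool.map(add_squares, parameters1, parameters2))
    print(total_error)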

If parameters1 and parameters2 are not iterables/mappable, but you just want to run the function 24 times, you can submit the job the required number of times and later collect the results using a callback.

class TotalError:
    def __init__(self):
        self.value = 0

    def __call__(self, r):
        # r is the completed Future; fold its result into the running total
        self.value += r.result()

total_error = TotalError()
with futures.ProcessPoolExecutor() as pool:
    for i in range(24):
        future_result = pool.submit(some_function_call, parameters1, parameters2)
        future_result.add_done_callback(total_error)

print(total_error.value)
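If you would rather avoid the callback class, the same accumulation can be written with futures.as_completed; a sketch reusing your names:

total_error = 0
with futures.ProcessPoolExecutor() as pool:
    jobs = [pool.submit(some_function_call, parameters1, parameters2)
            for i in range(24)]
    # as_completed yields each future as soon as it finishes
    for job in futures.as_completed(jobs):
        total_error += job.result()

print(total_error)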
Gerges

You can use Python's multiprocessing module:

from multiprocessing import Pool, cpu_count

def wrapped_some_function_call(args):
    """
    We need to wrap the call to unpack the parameters
    we built before as a tuple, so that we can use pool.map.
    """
    return some_function_call(*args)

if __name__ == "__main__":  # required on Windows, where worker processes re-import this module
    all_args = [(parameters1, parameters2) for i in range(24)]

    # You can use whatever worker count you like, but your machine's core
    # count is usually a good choice (although maybe not the best).
    with Pool(cpu_count()) as pool:
        results = pool.map(wrapped_some_function_call, all_args)

    total_error = sum(results)
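As an aside, Pool.starmap (Python 3.3+) unpacks the argument tuples for you, so the wrapper function isn't strictly needed; a sketch under the same assumptions:

from multiprocessing import Pool, cpu_count

if __name__ == "__main__":
    all_args = [(parameters1, parameters2) for i in range(24)]
    with Pool(cpu_count()) as pool:
        # starmap calls some_function_call(*args) for each tuple in all_args
        results = pool.starmap(some_function_call, all_args)
    total_error = sum(results)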
Netwave