1

I'm trying to figure out how I can speed up a grid search for a function that takes 2 input (a lower threshold value and an upper threshold value) in Python by using multiprocessing. I checked http://pymotw.com/2/multiprocessing/basics.html and tried using the following:

import multiprocessing

lower_threshold = range(0,100)
upper_threshold = range(100,200)

def worker():
    bestAccuracy = 0
    bestLowerThreshold = 0
    bestUpperThreshold = 0    
    for lowerThreshold in lower_threshold:
        for upperThreshold in upper_threshold:
            accuracy = someOtherFunction(lower_threshold,upper_threshold)
            if accuracy > bestAccuracy:
                bestAccuracy = accuracy
                bestLowerTheshold = lowerThreshold
                bestUpperThreshold = upperThreshold
    return bestAccuracy, bestLowerThreshold, bestUpperThreshold

if __name__ == '__main__':
    jobs = []
    for i in range(8):
        p = multiprocessing.Process(target=worker)
        jobs.append(p)
        p.start()

When I run this code, then I end up having my 8 core processes that run the same parameters/input for my function. Is there some way that I can specify that the the cores processes should use different parameters? In advance, thanks for your help.

  • Your `worker` function doesn't take any parameters, so you're just running the exact same computations in parallel. Find a way to partition your search space into 8 parts (maybe make `lower_threshold` your parameter and give each worker a different portion of that range). From there you only have to find the best solution of the 8. – Blender Aug 07 '14 at 00:05

2 Answers2

3

You can use itertools.product to create an iterator of 2-element tuples containing all the (lowerThreshHold, upperThreshold) pairs in your nested for loops, and then split that iterator up into eight chunks, passing one to each of the processes in your pool. Then you just sort the results you get from each worker to pick the best overall result. I'd also recommend using multiprocessing.Pool instead of multiprocessing.Process, to simplify things.

import multiprocessing
import itertools

def get_chunks(iterable, chunks=1):
    # This is from http://stackoverflow.com/a/2136090/2073595
    lst = list(iterable)
    return [lst[i::chunks] for i in xrange(chunks)]

def someOtherFunction(lower, upper):
    return (lower + upper) / 2

def worker(pairs):
    best_accuracy = best_lower = best_upper = 0
    for lower_threshold, upper_threshold in pairs:
        accuracy = someOtherFunction(lower_threshold, upper_threshold)
        if accuracy > best_accuracy:
            best_accuracy = accuracy
            best_lower = lower_threshold
            best_upper = upper_threshold
    return best_accuracy, best_lower, best_upper

if __name__ == '__main__':
    jobs = []
    pairs = itertools.product(xrange(0, 100), xrange(100, 200))
    chunked_pairs = get_chunks(pairs, chunks=multiprocessing.cpu_count())
    pool = multiprocessing.Pool()
    results = pool.map(worker, chunked_pairs)
    pool.close()
    pool.join()
    print results
    # Now combine the results
    sorted_results = reversed(sorted(results, key=lambda x: x[0]))
    print next(sorted_results)  # Winner

Output:

[(147, 98, 196), (148, 99, 197), (148, 98, 198), (149, 99, 199)]
(149, 99, 199)
dano
  • 91,354
  • 19
  • 222
  • 219
  • Thank you very much for the help. I'm not quite sure about the following, so I'll just ask: What would a possible solution look like if my function required 4 threshold value parameters instead of 2? – sentimentMining Aug 07 '14 at 02:10
  • `pairs = itertools.product(xrange(...), xrange(...))` would become `groups = itertools.product(xrange(...), xrange(...), xrange(...), xrange(...))`. And `for lower_threshold, upper_threshold in pairs:` would becomes `for thrshld1, thrshld2, thrshl13, trhshld4 in group:`. And of course you'd need to save the values for each `best_thrshld*` as you iterate, and return them all. – dano Aug 07 '14 at 02:14
  • Thanks. Would it be possible to have the worker function take more arguments than 'pairs' (in this case I was thinking of arguments that would be used as arguments when someOtherFunction is called within the body of the worker function)? If yes: How could this be done? – sentimentMining Aug 19 '14 at 14:08
  • @sentimentMining If you're talking about arguments that would be the same for every set of `pairs`, it can be done using [`functools.partial`](https://docs.python.org/2.7/library/functools.html#functools.partial). `func = partial(worker, arg1, arg2)` ; `pool.map(func, chunked_pairs)`. And the definition of `worker` would become: `def worker(arg1, arg2, pairs):` – dano Aug 19 '14 at 14:15
0

Generally multiprocessing works best if you have a pool of several workers, and a longish list of inputs. Each batch of inputs gets processed in parallel, and results are returned to the caller. Then the pool runs the next batch, and so on until all the inputs are consumed.

The following code passes in the lower_threshold range, and each one re-uses the upper_threshold:

source

import multiprocessing, random

upper_threshold = range(10,20)

def someOtherFunction(a,b):
    print 'ding:',a,b
    return random.randint(10, 100)

def worker(thresh):
    bestAccuracy = 0
    bestLowerThreshold = 0
    bestUpperThreshold = 0    
    for upperThreshold in upper_threshold:
        accuracy = someOtherFunction(thresh, upper_threshold)
        if accuracy > bestAccuracy:
            bestAccuracy = accuracy
            bestLowerTheshold = thresh
            bestUpperThreshold = upperThreshold
    return bestAccuracy, bestLowerThreshold, bestUpperThreshold

if __name__ == '__main__':
    lower_threshold = range(0,10)
    pool = multiprocessing.Pool()
    print pool.map( worker, [[th] for th in lower_threshold] )
    pool.close()
    pool.join()

output

ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [0] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [4] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [6] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [1] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [5] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [7] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [3] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [9] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [2] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
ding: [8] [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
[(80, 0, 11), (98, 0, 17), (96, 0, 17), (89, 0, 17), (96, 0, 17), (95, 0, 10), (95, 0, 18), (100, 0, 18), (90, 0, 18), (97, 0, 17)]
johntellsall
  • 14,394
  • 4
  • 46
  • 40
  • Thank you very much for the help. I'm not quite sure about the following, so I'll just ask: What would a possible solution look like if my function required 4 threshold value parameters instead of 2? – sentimentMining Aug 07 '14 at 02:09
  • it would look the same: unroll one loop to produce a stream of data. Each worker takes one value from the stream, then explicitly loops over the other three parameters. The given code unrolled the `lower_threshold` loop, then each worker did the upper_threshold loop directly. – johntellsall Aug 07 '14 at 19:24