
Possible Duplicate:
how do I parallelize a simple python loop?

I'm quite new to Python (using Python 3.2) and I have a question concerning parallelisation. I have a for-loop that I wish to execute in parallel using "multiprocessing" in Python 3.2:

def computation(i, j):
    global output

    for x in range(i, j):
        localResult = ...  # perform some computation as a function of i and j
        output.append(localResult)

In total, I want to perform this computation for a range of i=0 to j=100. Thus I want to create a number of processes that each call the function "computation" with a subdomain of the total range. Any ideas on how to do this? Is there a better way than using multiprocessing?

More specifically, I want to perform a domain decomposition, and I have the following code:

from multiprocessing import Pool

class testModule:

    def __init__(self):
        pass

    def computation(self, args):
        start, end = args
        print('start: ', start, ' end: ', end)

testMod = testModule()
length = 100
np=4
p = Pool(processes=np)
p.map(testMod.computation, [(length, startPosition, length//np) for startPosition in range(0, length, length//np)])

I get an error message mentioning PicklingError. Any ideas what could be the problem here?

user1499144
    How much computation are you actually doing in the for loop? If it's a simple expression depending on i and j, the overhead of creating multiple processes/threads will far outweigh the benefits of doing the computation in parallel. – mgilson Jul 24 '12 at 13:06
  • It's quite heavy computation, involving calling several other functions. The loop is perfectly parallel so I definitely need to create multiple processes/threads. – user1499144 Jul 24 '12 at 13:08
    A simple google search for `python parallelize for loop` would have led you to `joblib` in a matter of seconds. – phant0m Jul 24 '12 at 13:19
  • @SvenMarnach, I don't believe this to be a duplicate. OP is also asking if there's a simpler approach to his problem and options such as `joblib` are not covered in that previous question. – Louis Thibault Jul 24 '12 at 13:24
  • @blz: The fact that nobody mentioned `joblib` in the answers to the other question hardly means it's not a duplicate – you could easily add the same answer there. – Sven Marnach Jul 24 '12 at 14:41

2 Answers

20

Joblib is designed specifically to wrap around multiprocessing for the purposes of simple parallel looping. I suggest using that instead of grappling with multiprocessing directly.

The simple case looks something like this:

from joblib import Parallel, delayed
Parallel(n_jobs=2)(delayed(foo)(i**2) for i in range(10))  # n_jobs = number of processes

The syntax is simple once you understand it. We are using generator syntax in which delayed is used to call function foo with its arguments contained in the parentheses that follow.

In your case, you should either rewrite your for loop with generator syntax, or define another function (i.e. a 'worker' function) that performs the operations of a single loop iteration, and place that into the generator syntax of a call to Parallel.

In the latter case, you would do something like:

Parallel(n_jobs=2)(delayed(foo)(parameters) for x in range(i,j))

where foo is a function you define to handle the body of your for loop. Note that you do not want to append to a list, since Parallel is returning a list anyway.
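Applied to the question's concrete loop, a minimal sketch might look like the following. Here `foo` is a hypothetical stand-in for the body of your loop (the real computation isn't shown in the question), with i=0 and j=100 as stated:

```python
from joblib import Parallel, delayed

def foo(x):
    # hypothetical worker: stands in for the real per-iteration computation
    return x * x

# run iterations x = 0..99 across 4 worker processes;
# Parallel gathers the per-iteration results into a list, in order
output = Parallel(n_jobs=4)(delayed(foo)(x) for x in range(0, 100))
```

No manual `append` is needed: `output` already holds one result per iteration, in iteration order.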

Louis Thibault
  • Good to hear! Would you mind sanity-checking my last example? I'm not entirely sure about calling `list.append` in this manner, and I haven't actually tested my code. Although it *should* work... method calls are no different from standalone functions, in theory. – Louis Thibault Jul 24 '12 at 13:18
    @blz: it's not possible. Arguments to `delayed` must be picklable. Also, it doesn't make sense to do this, as `Parallel` will return a list anyway. – Fred Foo Jul 24 '12 at 13:26
  • @larsmans, Yikes... I need to sleep more. I shall edit. – Louis Thibault Jul 24 '12 at 13:27
  • You should provide a worker function that processes a single item: i.e. accepts `x` as a parameter. The second argument to `delayed` is simply `x`. – phant0m Jul 24 '12 at 13:27
  • @phant0m, thanks! The edit has been made. I should probably go drink some coffee =/ – Louis Thibault Jul 24 '12 at 13:30
  • Wow, so simple and works well in IPython! It's amazing how hard this was to find. This is my third time coming back to the topic of *easy* Python multiprocessing, and the first time I thought "yep, this is the answer". – Dan Stahlke Sep 28 '13 at 21:38
  • Can joblib handle objects? I have been trying to do deepcopies of objects in a biological virus simulator, but I have not been able to do so using Joblib. Please see post: http://stackoverflow.com/questions/23788521/joblib-with-objects – ericmjl May 22 '14 at 21:00
6

In this case, you probably want to define a simple function to perform the calculation and get localResult.

def getLocalResult(args):
    """ Do whatever you want in this func.
        The point is that it takes x, i, j and
        returns localResult.
    """
    x, i, j = args  # unpack args
    return doSomething(x, i, j)

Now in your computation function, you just create a pool of workers and map the local results:

import multiprocessing
def computation(i, j, np=4):
    """ np is the number of processes to fork """
    p = multiprocessing.Pool(np)
    output = p.map(getLocalResult, [(x, i, j) for x in range(i, j)])
    p.close()
    p.join()
    return output

I've removed the global here because it's unnecessary (globals are usually unnecessary). In your calling routine you should just do output.extend(computation(i, j, np=4)) or something similar.
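Putting the two pieces together, a runnable sketch could look like the following. The body of `getLocalResult` here is a hypothetical placeholder, since the question doesn't show the real computation:

```python
import multiprocessing

def getLocalResult(args):
    x, i, j = args
    # hypothetical stand-in for the real computation on x within [i, j)
    return x + i + j

def computation(i, j, np=4):
    """ np is the number of processes to fork """
    p = multiprocessing.Pool(np)
    results = p.map(getLocalResult, [(x, i, j) for x in range(i, j)])
    p.close()
    p.join()
    return results

if __name__ == '__main__':
    output = []
    output.extend(computation(0, 100))
```

The `if __name__ == '__main__'` guard matters on platforms where multiprocessing spawns fresh interpreters, since each child re-imports the module.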

EDIT

Here's a "working" example of your code:

from multiprocessing import Pool

def computation(args):
    length, startPosition, npoints = args
    print(args)

length = 100
np=4
p = Pool(processes=np)
p.map(computation, [(startPosition, startPosition + length//np, length//np) for startPosition in range(0, length, length//np)])

Note that what you had didn't work because you were using an instance method as your function. multiprocessing starts new processes and sends the information between processes via pickle, therefore, only objects which can be pickled can be used. Note that it really doesn't make sense to use an instance method anyway. Each process is a copy of the parent, so any changes to state which happen in the processes do not propagate back to the parent anyway.
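To make the domain decomposition explicit, here is a sketch in which each worker receives its own `(start, end)` subrange and loops over it. The squaring is just a placeholder for the real per-element work:

```python
from multiprocessing import Pool

def computation(bounds):
    # each worker process handles its own (start, end) subrange
    start, end = bounds
    return sum(x * x for x in range(start, end))  # placeholder work

if __name__ == '__main__':
    length, np = 100, 4
    step = length // np
    # four subranges: (0, 25), (25, 50), (50, 75), (75, 100)
    chunks = [(s, s + step) for s in range(0, length, step)]
    p = Pool(processes=np)
    partials = p.map(computation, chunks)
    p.close()
    p.join()
    total = sum(partials)
```

Because `computation` is a module-level function (not an instance method), it pickles cleanly, and each partial result is returned to the parent rather than relying on shared state.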

mgilson
  • Sorry for not being clear enough, but in the function getLocalResult, I need to know over which interval each process is working. Let's say the total interval is from 0 to 100, that I use 4 processes and that the local interval is given by iLocal and jLocal. Then for the first process, iLocal=0 and jLocal = 25, for the second process iLocal is 26 and jLocal is 50 and so on. Currently, I suppose that x will represent iLocal, but how can I know the end of the interval that the subprocess operates on? – user1499144 Jul 24 '12 at 13:48
  • @user1499144 -- Are you doing some sort of domain decomposition? If so, you could replace the list comprehension with something like: `[(i,100//np) for i in range(0,100,100//np)]` -- then sort have a `for` loop inside "getLocalResult". – mgilson Jul 24 '12 at 13:56
  • domain decomposition is exactly what I am trying to do! However, I still don't get it to work after having considered your comments, you can see my entire code in the answer below. – user1499144 Jul 24 '12 at 15:10
  • I edited the original question above so that you can see the piece of code that is not working. Thanks a lot for any input! – user1499144 Jul 24 '12 at 15:15