I have a program in which I would like to run one of the functions in parallel for several arguments.

The program has the following structure:

# import statements

def function1():
    pass  # do something

def function2():
    pass  # do something

def main():
    function1()

I found several examples online of how to use the multiprocessing library, such as the following general template:

import multiprocessing

def worker(num):
    print 'Worker:', num
    return

if __name__ == '__main__':
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        p.start()

From what I understand, worker() is the function that is meant to run in parallel. But I am not sure where or how to use the if __name__ == '__main__': block in my code.

As of now, that block is inside my main(), and when I run the program the worker function is not executed multiple times; instead, my main() gets executed multiple times.

So where is the proper place to put the if __name__ == '__main__': block?

linusg
Mustard Tiger
  • Put the parallelization logic (the calls to `multiprocessing`) in the `main` function. Then, call `main` from within `if __name__ == '__main__'` – inspectorG4dget May 03 '16 at 17:47
  • You need to show the actual code you're running. Since nothing whatsoever in the code you _showed_ calls your `main()` function, it's impossible for us to guess why it's _ever_ called. But if you follow the template you posted second instead, your code will work. – Tim Peters May 03 '16 at 17:47

1 Answer

Blending together the two examples you provide, it would look like this:

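import multiprocessing

def worker(num):
    print 'Worker:', num
    return

def main():
    # Start every process first so they run concurrently;
    # joining inside the same loop would run them one at a time.
    processes = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        p.start()
        processes.append(p)

    # Wait for all of them to finish.
    for p in processes:
        p.join()

if __name__ == '__main__':
    main()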

Replace worker with function1, i.e. whichever you'd like to parallelise.
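As a minimal sketch of that substitution: the version below assumes function1 takes a single argument (the question does not show its signature) and uses a made-up list of arguments. It is written with Python 3 print() syntax, unlike the Python 2 snippets in this answer.

```python
import multiprocessing


def function1(arg):
    # Placeholder body: the real work for one argument goes here.
    print('function1 called with', arg)


def main():
    arguments = ['a', 'b', 'c']  # hypothetical inputs, one process each

    # Start all processes first so they can run at the same time.
    processes = []
    for arg in arguments:
        p = multiprocessing.Process(target=function1, args=(arg,))
        p.start()
        processes.append(p)

    # Then wait for every process to finish.
    for p in processes:
        p.join()

    # exitcode is 0 for a process that finished without error.
    return [p.exitcode for p in processes]


if __name__ == '__main__':
    main()
```

Starting one process per argument is fine for a handful of inputs; for many inputs a pool of workers is usually the better fit.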

The key part is calling main() from inside the if __name__ == '__main__': block. In this simple example you could just as easily put the body of main() directly under if __name__ == '__main__':.

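The guard matters for more than imports. On platforms where multiprocessing starts each child in a fresh interpreter (Windows, for example), every child imports this file again, so any process-creating code left at module level runs again in every child. That is why main() can appear to execute multiple times, or why the program never finishes. Only where children are forked (traditionally the default on Linux) can you leave the guard out; in that case its sole purpose is to let you import functions from this script into other scripts or an interactive session without running the code in main(). See What does if __name__ == "__main__": do?.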
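To see what the guard protects against, the following sketch (Python 3.4+, where multiprocessing.get_context is available) forces the 'spawn' start method that Windows uses. The module-level print should run once in the parent and once more in the child, because the child imports this file again under a different module name. The function name report is made up for illustration.

```python
import multiprocessing

# Module level: runs in the parent and again in each spawned child.
print('importing module, __name__ =', __name__)


def report(num):
    print('child', num, 'running')


if __name__ == '__main__':
    # 'spawn' starts a fresh interpreter for the child instead of forking.
    ctx = multiprocessing.get_context('spawn')
    p = ctx.Process(target=report, args=(0,))
    p.start()
    p.join()
```

In the child the module name is not '__main__', so the guarded block is skipped there. Process-creating code placed outside the guard would be re-run by the child in the same way as the print.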

Without the guard, the simplest usage (only on a platform that forks its child processes, such as Linux) would be:

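import multiprocessing

def worker(num):
    print 'Worker:', num
    return

# No __main__ guard here: only safe where child processes are forked.
processes = []
for i in range(5):
    p = multiprocessing.Process(target=worker, args=(i,))
    p.start()
    processes.append(p)

# Join after all processes have started so they run concurrently.
for p in processes:
    p.join()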

Edit: multiprocessing pool example

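import multiprocessing

def worker(num):
    #print 'Worker:', num
    return num

if __name__ == '__main__':
    pool = multiprocessing.Pool(multiprocessing.cpu_count())

    # imap yields results in input order, whichever worker finishes first
    result = pool.imap(worker, range(5))

    print list(result)

    # Shut the pool down once all results have been collected
    pool.close()
    pool.join()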

Prints:

[0, 1, 2, 3, 4]

See also Python multiprocessing.Pool: when to use apply, apply_async or map? for more detailed explanations.
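As a rough illustration of how two of those methods differ, this sketch (Python 3 syntax; the square function and the two helper names are stand-ins) runs the same work through map, which blocks and returns results in input order, and apply_async, which returns handles whose results are fetched later with get().

```python
import multiprocessing


def square(num):
    return num * num


def run_with_map(pool, numbers):
    # map blocks until every result is ready; output follows input order.
    return pool.map(square, numbers)


def run_with_apply_async(pool, numbers):
    # apply_async returns immediately with one AsyncResult per call.
    handles = [pool.apply_async(square, (n,)) for n in numbers]
    # Calling get() in submission order keeps the results ordered.
    return [h.get() for h in handles]


if __name__ == '__main__':
    pool = multiprocessing.Pool(2)
    try:
        print(run_with_map(pool, range(5)))
        print(run_with_apply_async(pool, range(5)))
    finally:
        pool.close()
        pool.join()
```

Both calls should print the squares in input order. The apply_async form is mainly useful when the parent has other work to do between submitting the tasks and collecting the results.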

Community
Harry
  • When I run the first version, it works, but it does not give me the expected output of Worker: 0, Worker: 1, Worker: 2, ... Instead, the order varies every time I run it; sometimes it's (1,2,0,4,3) or (0,3,2,1,4). Also, the second version you provided, without if __name__ == '__main__':, runs forever when I try it. – Mustard Tiger May 03 '16 at 20:22
  • The order varies because `multiprocessing` runs each call to `worker` in a separate process, possibly on a different core, and nothing controls the order in which those processes start, finish, and return. The output appears in the order in which they finish, rather than the order in which they were started. If you want to retain the order, take a look at `multiprocessing.Pool` and `apply_async` or `imap` https://docs.python.org/2/library/multiprocessing.html#examples. – Harry May 04 '16 at 09:39
  • For the second version, it may well be that if it's not in a function block, `.join()` is required. See https://docs.python.org/2/library/multiprocessing.html#process-and-exceptions – Harry May 04 '16 at 09:39
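To make the ordering point from the comments concrete, here is a small sketch (Python 3 syntax; slow_identity and its sleep times are invented for the demonstration) that compares imap, which yields results in input order, with imap_unordered, which yields them as they complete.

```python
import multiprocessing
import time


def slow_identity(num):
    # Larger inputs sleep less, so they tend to finish first.
    time.sleep(0.05 * (4 - num))
    return num


if __name__ == '__main__':
    pool = multiprocessing.Pool(5)
    try:
        # imap yields results in input order.
        print(list(pool.imap(slow_identity, range(5))))
        # imap_unordered yields results as they complete.
        print(list(pool.imap_unordered(slow_identity, range(5))))
    finally:
        pool.close()
        pool.join()
```

The first list follows the input order; the second typically does not, since it reflects which worker finished first.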