
I need to write a script that, on some condition, spawns a parallel process (worker) and makes it do some IO job, then closes that process when it has finished. But the processes do not seem to exit by default.

Here is my approach:

import multiprocessing
from time import sleep

pool = multiprocessing.Pool(4)

def f(x):
    sleep(10)
    print(x)
    return True

r = pool.map_async(f, [1,2,3,4,5,6,7,8,9,10])

But if I run it in IPython and wait for all the prints, I can then run ps aux | grep ipython and see a lot of processes. So it looks like these workers are still alive.

Maybe I'm doing something wrong, but how can I make these processes terminate when they have finished their task? And what approach should I use if I want to spawn a lot of workers one by one (triggered by some rmq message, for example)?

Paul

1 Answer


Pool spawns worker processes when you declare the pool. They do not get killed until the pool is shut down. Instead, they wait there for more work to appear in the queue.

If you change your code to:

r = pool.map_async(f, [1,2,3,4,5,6,7,8,9,10])
pool.close()
pool.join()
print("check ps ax now")
sleep(10)

you will see the pool processes have disappeared.

Another thing: your program might not work as intended, because you declare function f after you declare your pool. I had to move pool = multiprocessing.Pool(4) so that it follows the declaration of f, but this may vary between Python versions. In any case, if you get odd "module has no attribute" exceptions, this is the reason.

Hannu

  • yes, I got that exception so I moved the pool declaration. Thanks! Is this approach OK to spawn processes one by one asynchronously? – Paul Dec 01 '16 at 14:20
  • Processes are created as soon as you call pool = multiprocessing.Pool(4); it does not take map_async or any other job-submission call to launch them. The idea of a pool is to have the processes ready to process whatever tasks you decide to feed them, and they terminate only when you close the pool and processing of the last tasks has finished. Using map_async the way you use it, to process a list of data, is completely fine. This is exactly how it is meant to be used. – Hannu Dec 01 '16 at 14:27
  • thanks! :) I've asked another question here: http://stackoverflow.com/questions/40913207/return-value-from-spawned-multiprocessing-process You might be interested. :) – Paul Dec 01 '16 at 14:32