
I want to run a bunch of jobs in parallel and then continue once all the jobs are finished. I've got something like

# based on example code from https://pymotw.com/2/multiprocessing/basics.html
import multiprocessing
import random
import time

def worker(num):
    """A job that runs for a random amount of time between 5 and 10 seconds."""
    time.sleep(random.randrange(5,11))
    print('Worker:' + str(num) + ' finished')
    return

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        jobs.append(p)
        p.start()

    # Iterate over the list of jobs, removing the ones that are finished, checking every second.
    while len(jobs) > 0:
        jobs = [job for job in jobs if job.is_alive()]
        time.sleep(1)

    print('*** All jobs finished ***')

It works, but I'm sure there must be a better way to wait for all the jobs to finish than iterating over them again and again until they are done.

Amir

2 Answers


What about?

for job in jobs:
    job.join()

This blocks on each process in turn: join() waits until that process has finished, so the loop only returns once every job is done. See the documentation for join() for more.
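
For context, here is a sketch of the loop dropped into the question's script; it is simply the question's code with the polling while loop replaced, nothing else changed:

import multiprocessing
import random
import time

def worker(num):
    """A job that runs for a random amount of time between 5 and 10 seconds."""
    time.sleep(random.randrange(5, 11))
    print('Worker:' + str(num) + ' finished')

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        jobs.append(p)
        p.start()

    # join() blocks until that particular process exits, so after this
    # loop every job is guaranteed to have finished.
    for job in jobs:
        job.join()

    print('*** All jobs finished ***')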

jayant
  • Note to future searchers: This usage can be indicative of a task that would benefit from a [Pool](https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers). – kungphu Oct 20 '17 at 07:22
  • What will happen if the `job` is already completed and we call the `join()` method on that object? – Anwar Shaikh Aug 19 '19 at 22:40
  • @EngineeredBrain It will return immediately – jayant Aug 20 '19 at 01:57
  • If I spawn a job from a different thread (Timer thread t creates job j and starts it) and then finish the main thread, will the main thread also wait for job j to finish? And what if t is a daemon thread? Can I still be sure that job j will always finish, no matter whether the parent threads have finished, even without calling .join? – AgentM Apr 06 '20 at 12:51
  • This is correct but can be dangerous if combined with a multiprocessing Queue. If you try to append an end result to a queue when the process is already joined, it will hang the parent process. – Lan Vukušič Aug 28 '20 at 14:09
  • I wonder how to use Process with several inputs like `pool.starmap()` does? – dexter2406 Feb 15 '21 at 22:36
  • The observation by @LanVukušič is important, especially in a multi-core setting: the process hangs, and it seems to hang at the join() call. The processes write to the multiprocessing Queue as expected, but the join never returns. – qboomerang Jul 04 '22 at 13:26
  • how can this be done without hanging the process? is there a way to make the processes self-destruct after putting their content onto the queue? – aloea Aug 09 '22 at 14:28
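
To illustrate the Pool suggestion (and starmap() for several inputs) from the comments above, here is a minimal sketch; the extra label argument and the pool size are only illustrative, not part of the answer:

import multiprocessing
import random
import time

def worker(num, label):
    """Sleep for a random 5-10 seconds, like the worker in the question."""
    time.sleep(random.randrange(5, 11))
    print(label + ':' + str(num) + ' finished')

if __name__ == '__main__':
    with multiprocessing.Pool(processes=5) as pool:
        # starmap() unpacks each tuple into worker(num, label) and blocks
        # until every task has finished, so no explicit join loop is needed.
        pool.starmap(worker, [(i, 'Worker') for i in range(5)])
    print('*** All jobs finished ***')

Because results come back from starmap() directly rather than through a Queue you manage yourself, this also sidesteps the Queue/join hang discussed in the comments above.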

You can make use of join. It lets you wait for another process to end.

from multiprocessing import Process

# f and x are placeholders for your own target function and argument
t1 = Process(target=f, args=(x,))
t2 = Process(target=f, args=('bob',))

t1.start()
t2.start()

# join() blocks until the corresponding process has finished
t1.join()
t2.join()

You can also use a Barrier. It works as for threads, letting you specify the number of processes you want to wait on; once that number have reached the barrier, it frees them all. Here client and server are assumed to be spawned as Processes.

from multiprocessing import Barrier

b = Barrier(2, timeout=5)

# start_server(), accept_connection(), make_connection() and the
# process_*_connection() calls are placeholders for your own code.
def server():
    start_server()
    b.wait()
    while True:
        connection = accept_connection()
        process_server_connection(connection)

def client():
    b.wait()
    while True:
        connection = make_connection()
        process_client_connection(connection)
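
For a self-contained variant of the same idea, the barrier can be created in the parent and passed to each child process explicitly; the names and delays below are only illustrative:

import multiprocessing
import time

def party(barrier, name, delay):
    time.sleep(delay)                  # simulate unequal start-up times
    print(name + ' reached the barrier')
    barrier.wait()                     # blocks until both parties have arrived
    print(name + ' released')

if __name__ == '__main__':
    b = multiprocessing.Barrier(2, timeout=5)
    procs = [multiprocessing.Process(target=party, args=(b, 'server', 1)),
             multiprocessing.Process(target=party, args=(b, 'client', 3))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()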

And if you want more functionality, such as sharing data and more flow control, you can use a manager.
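
A minimal sketch of that, assuming the goal is simply for the workers to report results back to the parent through a shared dict (the squaring is only a stand-in for real work):

import multiprocessing

def worker(num, results):
    # write into the manager-backed dict shared with the parent
    results[num] = num * num

if __name__ == '__main__':
    with multiprocessing.Manager() as manager:
        results = manager.dict()
        jobs = [multiprocessing.Process(target=worker, args=(i, results))
                for i in range(5)]
        for job in jobs:
            job.start()
        for job in jobs:
            job.join()
        print(dict(results))   # e.g. {0: 0, 1: 1, 2: 4, 3: 9, 4: 16}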

Rbtnk