
Why does program A deadlock while program B runs fine?

Both use a queue to pass results between the main process and the sub-processes. Both wait for all sub-processes to finish before retrieving the results from the queue.

In the actual program, I need a queue to pass streams of results between steps, so that results can be worked on as they are generated, before the step producing them has completed.

Changing the mp.Queue in program A to a queue from an mp.Manager fixes the deadlock. But using a manager seems to carry a performance penalty, because every queue operation goes through a proxy to the separate manager process.
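
For reference, a minimal sketch of that manager-based variant of Program A. The function name `run_with_manager_queue` is made up for this illustration; everything else is `mp.Manager`, `manager.Queue()` and `mp.Pool`. The results are sorted before printing because the order in which the workers put their items is not guaranteed.

```python
import multiprocessing as mp


def worker(q, i):
    q.put(i)


def run_with_manager_queue(n_tasks=4):
    # Hypothetical helper: same flow as Program A, but with a manager queue.
    with mp.Manager() as manager:
        # manager.Queue() returns a proxy, which can be pickled and
        # therefore passed to pool workers as an ordinary task argument.
        q = manager.Queue()
        with mp.Pool() as pool:
            for i in range(n_tasks):
                pool.apply_async(worker, args=(q, i))
            # Wait until all workers have completed their tasks.
            pool.close()
            pool.join()
        # Collect the results while the manager is still alive.
        return [q.get() for _ in range(n_tasks)]


if __name__ == "__main__":
    print(sorted(run_with_manager_queue()))
```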

Program A (deadlocks without manager):

import multiprocessing as mp

def worker(q, i):
    q.put(i)

if __name__ == "__main__":
    q = mp.Queue()

    # Start sub-processes
    p = mp.Pool()
    for i in range(4):
        p.apply_async(worker, args=(q, i)) 

    # wait till all workers complete their task.
    p.close()
    p.join()

    # Get results from queue
    for i in range(4):
        print(q.get())
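
One possible way to keep a plain `mp.Queue` together with a `Pool`, sketched under the assumption that the workers only need the queue as a module-level global: hand the queue to each worker process through the pool's `initializer`/`initargs`, so it is passed when the worker process is created rather than pickled as a task argument. `init_worker`, `_queue` and `run_with_inherited_queue` are hypothetical names, and no timing comparison against the manager is implied.

```python
import multiprocessing as mp

_queue = None  # set inside each pool worker by init_worker()


def init_worker(q):
    # Runs once in every worker process when the pool starts it.
    global _queue
    _queue = q


def worker(i):
    # The queue is not a task argument here; it was inherited at start-up.
    _queue.put(i)


def run_with_inherited_queue(n_tasks=4):
    q = mp.Queue()
    with mp.Pool(initializer=init_worker, initargs=(q,)) as pool:
        for i in range(n_tasks):
            pool.apply_async(worker, args=(i,))
        # Take one item per task; the timeout avoids blocking forever
        # if a task fails before it puts anything on the queue.
        results = [q.get(timeout=30) for _ in range(n_tasks)]
        pool.close()
        pool.join()
    return results


if __name__ == "__main__":
    print(sorted(run_with_inherited_queue()))
```

Reading from the queue before `pool.join()` also means results can be consumed while later tasks are still running.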

Program B is my simplification of a program from the logging cookbook (https://docs.python.org/3/howto/logging-cookbook.html), found below the line "basis for code meeting your own specific requirements", and it doesn't use a manager ...

Program B:

import multiprocessing as mp
def worker(q, i):
    q.put(i)

if __name__ == "__main__":
    q = mp.Queue()

    # Start sub-processes
    ws = []
    for i in range(4):
        w = mp.Process(target=worker, args=(q, i))
        ws.append(w)
        w.start()

    # wait till all workers complete their task.
    for w in ws:
        w.join()

    # Get results from queue
    for i in range(4):
        print(q.get())
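
Program B works with four small integers, but the multiprocessing programming guidelines warn (under "Joining processes that use queues") that a process which has put items on a queue may not terminate until those items have been flushed, so joining before reading can itself deadlock once the results get large. A sketch of the safer order, which reads one item per worker first and joins afterwards; `run_processes_drain_first` is a made-up name.

```python
import multiprocessing as mp


def worker(q, i):
    q.put(i)


def run_processes_drain_first(n_workers=4):
    q = mp.Queue()
    ws = [mp.Process(target=worker, args=(q, i)) for i in range(n_workers)]
    for w in ws:
        w.start()

    # Read exactly one item per worker BEFORE joining, so no worker is
    # left waiting to flush buffered data into the queue's pipe.
    results = [q.get(timeout=30) for _ in range(n_workers)]

    # Now the workers have nothing left to flush and can be joined safely.
    for w in ws:
        w.join()
    return results


if __name__ == "__main__":
    print(sorted(run_processes_drain_first()))
```
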
hamster on wheels
  • Congratulations: you seem to have triggered undefined behaviour in Python. The docs advise using a Manager, which should make it self-explanatory why you must use one. – Jean-François Fabre Jul 27 '17 at 09:42
  • Try replacing `p.apply_async` with `p.apply` and Python will recommend that you use a manager. I think `apply_async` is the most sensitive function; it won't work at all without a manager. – Jean-François Fabre Jul 27 '17 at 09:43
  • The queue occasionally has corrupted data without a manager. I guess shared memory is the way to go: https://stackoverflow.com/questions/7894791/use-numpy-array-in-shared-memory-for-multiprocessing – hamster on wheels Jul 27 '17 at 11:24
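
Following up on the comment about `p.apply`: a small sketch that makes the hidden failure in Program A visible without giving up `apply_async`. It keeps the `AsyncResult` that `apply_async` returns and calls `.get()` on it, which re-raises whatever went wrong with the task. On CPython 3 this should be a `RuntimeError` saying that Queue objects should only be shared between processes through inheritance. If that is what happens, the workers never run, nothing is ever put on the queue, and the final `q.get()` in Program A simply blocks on an empty queue. `first_task_error` is a made-up helper name.

```python
import multiprocessing as mp


def worker(q, i):
    q.put(i)


def first_task_error():
    # Hypothetical helper: submit one Program A style task and report
    # the error message, if the task could not be sent to a worker.
    q = mp.Queue()
    with mp.Pool(2) as pool:
        result = pool.apply_async(worker, args=(q, 0))
        try:
            # get() re-raises the exception stored for this task;
            # Program A never calls it, so the error stays invisible.
            result.get(timeout=30)
        except RuntimeError as exc:
            return str(exc)
    return None


if __name__ == "__main__":
    print(first_task_error())
```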
