When you put a large-enough object into a multiprocessing.Queue, the program seems to hang at weird places. Consider this minimal example:
import multiprocessing

def dump_dict(queue, size):
    queue.put({x: x for x in range(size)})
    print("Dump finished")

if __name__ == '__main__':
    SIZE = int(1e5)
    queue = multiprocessing.Queue()
    process = multiprocessing.Process(target=dump_dict, args=(queue, SIZE))
    print("Starting...")
    process.start()
    print("Joining...")
    process.join()
    print("Done")
    print(len(queue.get()))
If the SIZE parameter is small enough (<= 1e4, at least in my case), the whole program runs without a problem, but once SIZE is big enough, the program hangs at weird places. When searching for an explanation (e.g. "python multiprocessing - process hangs on join for large queue"), I have only found general answers along the lines of "you need to consume from the queue". What seems weird, though, is that the program actually prints Dump finished, i.e. it reaches the line after putting the object into the queue. Furthermore, using Queue.put_nowait instead of Queue.put did not make a difference.
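
For completeness, here is my sketch of what I assume those answers mean, i.e. having the parent call queue.get() before join() (the names are the same as in the example above):

import multiprocessing

def dump_dict(queue, size):
    queue.put({x: x for x in range(size)})
    print("Dump finished")

if __name__ == '__main__':
    SIZE = int(1e5)
    queue = multiprocessing.Queue()
    process = multiprocessing.Process(target=dump_dict, args=(queue, SIZE))
    process.start()
    # Drain the queue before joining, so the child can flush the
    # pickled data through the underlying pipe and then exit.
    result = queue.get()
    process.join()
    print(len(result))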
Finally, if you use Process.join(1) instead of Process.join(), the whole process finishes with the complete dictionary in the queue (i.e. the print(len(..)) line prints 100000).
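
To make that concrete, the only change is at the tail of the main block (a sketch; everything else stays as in the example above):

    process.join(1)          # returns after at most 1 second instead of blocking forever
    print("Done")
    print(len(queue.get()))  # prints 100000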
Can somebody give me a little bit more insight into this?