While working on a pet project of mine involving simultaneous downloads using the multiprocessing
module I encountered a curious behaviour involving Queue
objects generated by a multiprocessing.Manager()
object.
Depending on how I put a Queue object (generated through a Manager
) inside another Queue
object (also generated through a Manager
), I get a different behaviour for doing, to my understanding, the same thing.
Here's a minimum working example:
import multiprocessing
import Queue
def work(inbound_queue, keep_going):
while keep_going.value == 1:
try:
outbound_queue = inbound_queue.get(False) # this fails in case 3
#do some work
outbound_queue.put("work done!")
except Queue.Empty:
pass #this is busy wait of course, it's just an example
class Weird:
def __init__(self):
self.manager = multiprocessing.Manager()
self.queue = self.manager.Queue()
self.keep_going = multiprocessing.Value("i", 1)
self.worker = multiprocessing.Process(target = work, args = (self.queue, self.keep_going))
self.worker.start()
def stop(self): #close and join the second process
self.keep_going.value = 0
self.worker.join()
def queueFromOutside(self, q):
self.queue.put(q)
return q
def queueFromNewManager(self):
q = multiprocessing.Manager().Queue()
self.queue.put(q)
return q
def queueFromOwnManager(self):
q = self.manager.Queue()
self.queue.put(q)
return q
if __name__ == '__main__':
instance = Weird()
# CASE 1
queue = multiprocessing.Manager().Queue()
q1 = instance.queueFromOutside(queue) # Works fine
print "1: ", q1.get()
# CASE 2
q2 = instance.queueFromNewManager() # Works fine
print "2: ", q2.get()
# CASE 3
q3 = instance.queueFromOwnManager() # Error
print "3: ", q3.get()
instance.stop() #sadly never called :(
and its outputs (python 2.7.10 x86, windows).
OUTPUT for main:
1: work done!
2: work done!
3:
then the worker process crashes, leaving q3.get() hanging.
OUTPUT for worker process:
Process Process-2:
Traceback (most recent call last):
File "C:\Python27\lib\multiprocessing\process.py", line 258, in _bootstrap
self.run()
File "C:\Python27\lib\multiprocessing\process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "J:\Dropbox\Python\queues2.py", line 7, in work
outbound_queue = inbound_queue.get(False) # this fails in case 3
File "<string>", line 2, in get
File "C:\Python27\lib\multiprocessing\managers.py", line 774, in _callmethod
raise convert_to_error(kind, result)
RemoteError:
---------------------------------------------------------------------------
Unserializable message: ('#RETURN', <Queue.Queue instance at 0x025A22B0>)
---------------------------------------------------------------------------
So the question is: why does the 3rd case cause a RemoteError
?
The example provided does not resemble the structure of the code in the actual project, but I do send queues to running processes, and it's working fine if I do it with methods #1 and #2. It'd be nice to use method #3 though, since it saves the trouble of getting a Manager
every time, wich can take surprisingly long (~100 ms on the machine I'm working from right now).
The question comes out of curiosity, as I'm still learning about all the cool things in the multiprocessing
module.
UPDATE, trying to clarify the question: in case 3 (queueFromOwnManager
) why does self.manager.Queue()
create a queue that once put in self.queue
, cannot be retrieved with self.queue.get()
, while a queue created with multiprocessing.Manager().Queue()
can be retrieved? The order of the execution of the 3 cases does not matter. Ideally, instance.queue
will be empty before and after any of the 3 method calls in the 3 examples.
UPDATE 2: made the exaple more similar to what I'm actually doing in the code