It's actually not necessary to include the queues in the args argument in this case, no matter what platform you're using. The reason is that even though it doesn't look like you're explicitly passing the two JoinableQueue instances to the child, you actually are - via self. Because self is explicitly being passed to the child, and the two queues are part of self, they end up being passed along to the child.
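
For concreteness, here's a minimal sketch of the kind of setup being discussed (the Worker class and the queue names are just placeholders, not code from your question): the queues are stored on self in __init__, and since the Process object itself is what gets handed to the child, the queues ride along without ever appearing in args:

import multiprocessing

class Worker(multiprocessing.Process):
    def __init__(self, task_queue, result_queue):
        super(Worker, self).__init__()
        # The queues become attributes of self, so they travel to the
        # child along with the Process object itself.
        self.task_queue = task_queue
        self.result_queue = result_queue

    def run(self):
        # Runs in the child; the inherited (or unpickled) queues work here.
        item = self.task_queue.get()
        self.result_queue.put(item * 2)
        self.task_queue.task_done()

if __name__ == '__main__':
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.JoinableQueue()
    w = Worker(tasks, results)
    w.start()
    tasks.put(21)
    tasks.join()
    print(results.get())  # 42
    w.join()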
On Linux, this happens via os.fork(), which means that the file descriptors used by the multiprocessing.connection.Connection objects that the Queue uses internally for inter-process communication are inherited by the child (not copied). Other parts of the Queue become copy-on-write, but that's ok; multiprocessing.Queue is designed so that none of the pieces that need to be copied actually need to stay in sync between the two processes. In fact, many of the internal attributes get reset after the fork occurs:
def _after_fork(self):
    debug('Queue._after_fork()')
    self._notempty = threading.Condition(threading.Lock())
    self._buffer = collections.deque()
    self._thread = None
    self._jointhread = None
    self._joincancelled = False
    self._closed = False
    self._close = None
    self._send = self._writer.send  # _writer is a Connection object
    self._recv = self._reader.recv
    self._poll = self._reader.poll
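
A quick way to see the inheritance in action (Linux/fork start method only; _reader and _writer are internal Connection attributes, poked at here purely for illustration) is to compare the pipe file descriptors on both sides of the fork - the numbers printed in the child match the parent's exactly:

import multiprocessing

def child(q):
    # Under the fork start method the child inherits the parent's file
    # descriptors, so these numbers match the parent's output exactly.
    print('child :', q._reader.fileno(), q._writer.fileno())
    q.put('sent through the inherited pipe')

if __name__ == '__main__':
    q = multiprocessing.Queue()
    print('parent:', q._reader.fileno(), q._writer.fileno())
    p = multiprocessing.Process(target=child, args=(q,))
    p.start()
    print(q.get())
    p.join()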
So that covers Linux. How about Windows? Windows doesn't have fork, so it will need to pickle self to send it to the child, and that includes pickling our Queues. Now, normally if you try to pickle a multiprocessing.Queue, it fails:
>>> import multiprocessing
>>> q = multiprocessing.Queue()
>>> import pickle
>>> pickle.dumps(q)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/pickle.py", line 1374, in dumps
    Pickler(file, protocol).dump(obj)
  File "/usr/local/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/usr/local/lib/python2.7/pickle.py", line 306, in save
    rv = reduce(self.proto)
  File "/usr/local/lib/python2.7/copy_reg.py", line 84, in _reduce_ex
    dict = getstate()
  File "/usr/local/lib/python2.7/multiprocessing/queues.py", line 77, in __getstate__
    assert_spawning(self)
  File "/usr/local/lib/python2.7/multiprocessing/forking.py", line 52, in assert_spawning
    ' through inheritance' % type(self).__name__
RuntimeError: Queue objects should only be shared between processes through inheritance
But this is actually an artificial limitation. multiprocessing.Queue objects can be pickled in some cases - how else could they be sent to child processes on Windows? And indeed, we can see that if we look at the implementation:
def __getstate__(self):
    assert_spawning(self)
    return (self._maxsize, self._reader, self._writer,
            self._rlock, self._wlock, self._sem, self._opid)

def __setstate__(self, state):
    (self._maxsize, self._reader, self._writer,
     self._rlock, self._wlock, self._sem, self._opid) = state
    self._after_fork()
__getstate__, which is called when pickling an instance, has an assert_spawning call in it, which makes sure we're actually spawning a process while attempting the pickle*. __setstate__, which is called while unpickling, is responsible for calling _after_fork.
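
If you want to watch that path get exercised without a Windows box, Python 3 lets you force the spawn start method (the one Windows uses) even on Linux; the Queue passed in args then really is pickled via __getstate__ on the way out and rebuilt via __setstate__ in the child (a small sketch, not from your code):

import multiprocessing

def reader(q):
    # In the child, the Queue has been rebuilt by __setstate__ (which in
    # turn called _after_fork), so it's fully usable again.
    q.put('rebuilt in the child')

if __name__ == '__main__':
    # 'spawn' is the only start method on Windows; forcing it here makes
    # multiprocessing pickle the Queue on its way to the child process.
    multiprocessing.set_start_method('spawn')
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=reader, args=(q,))
    p.start()
    print(q.get())  # prints: rebuilt in the child
    p.join()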
So how are the Connection objects used by the queues maintained when we have to pickle? It turns out there's a multiprocessing sub-module that does exactly that - multiprocessing.reduction. The comment at the top of the module states it pretty clearly:
#
# Module to allow connection and socket objects to be transferred
# between processes
#
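
The practical upshot (on Python 3, where these reducers are registered automatically) is that a Connection end can itself be sent through a Queue to another process - the reduction machinery duplicates the underlying fd/handle and rebuilds a working Connection on the other side. A rough sketch, assuming Linux and the default fork start method:

import multiprocessing

def child(q):
    # The Connection object arrives through the queue: reduction duplicated
    # the underlying fd and rebuilt a working Connection in this process.
    conn = q.get()
    conn.send('hello over a transferred Connection')

if __name__ == '__main__':
    q = multiprocessing.Queue()
    parent_end, child_end = multiprocessing.Pipe()
    p = multiprocessing.Process(target=child, args=(q,))
    p.start()
    q.put(child_end)            # the Connection gets reduced/rebuilt en route
    print(parent_end.recv())
    p.join()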
On Windows, the module ultimately uses the DuplicateHandle API provided by Windows to create a duplicate handle that the child process' Connection object can use. So while each process gets its own handle, they're exact duplicates - any action made on one is reflected on the other:
The duplicate handle refers to the same object as the original handle.
Therefore, any changes to the object are reflected through both
handles. For example, if you duplicate a file handle, the current file
position is always the same for both handles.
* See this answer for more information about assert_spawning