When you pass arguments to a Process, they are pickled in the parent, transmitted to the child, and unpickled there. Unfortunately, the round trip through pickle misbehaves for file objects: with protocol 0 it errors out outright, but with protocol 2 (the highest Python 2 protocol, and the one multiprocessing uses), it silently produces a junk file object:
>>> import pickle, sys
>>> pickle.loads(pickle.dumps(sys.stdout, pickle.HIGHEST_PROTOCOL))
<closed file '<uninitialized file>', mode '<uninitialized file>' at 0xDEADBEEF>
The same problem occurs for named files too; it's not unique to the standard handles. Basically, pickle can't round-trip a file object; even when it claims to succeed, the result is garbage.
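For instance, here's a sketch of a Python 2 session with an ordinary named file (the file name is just illustrative, and the traceback/address details are elided) showing both failure modes:
>>> import pickle
>>> f = open('scratch.txt', 'w')
>>> pickle.dumps(f, 0)                    # protocol 0 fails loudly
Traceback (most recent call last):
  ...
TypeError: can't pickle file objects
>>> pickle.loads(pickle.dumps(f, 2))      # protocol 2 "succeeds", but the result is junk
<closed file '<uninitialized file>', mode '<uninitialized file>' at 0xDEADBEEF>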
Generally, multiprocessing isn't really designed to handle a scenario like this; usually, Processes are worker tasks, and I/O is performed through the main process (because if the workers all wrote independently to the same file handle, you'd get interleaved writes). In Python 3.5 at least, this was fixed so the error is immediate and obvious: the file-like objects returned by open (TextIOWrapper and the Buffered* classes) raise an error when pickled with any protocol.
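A sketch of what that looks like on Python 3.5+ (the exact message wording varies by version):
>>> import pickle, sys
>>> pickle.dumps(sys.stdout, pickle.HIGHEST_PROTOCOL)
Traceback (most recent call last):
  ...
TypeError: cannot serialize '_io.TextIOWrapper' object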
The best you could do on Windows would be to send the known file descriptor as an argument:
sys.stdout.flush() # Precaution to minimize output interleaving
w = multiprocessing.Process(target=worker_with, args=(sys.stdout.fileno(),))
then reopen it on the other side using os.fdopen. For fds that aren't one of the standard handles (0, 1 and 2), since Windows uses the "spawn" method of creating new Processes, you'd need to make sure any such fd is opened as a consequence of importing the __main__ module when __name__ != "__main__" (Windows simulates a fork by importing the __main__ module with __name__ set to something else). Of course, if it's a named file rather than a standard handle, you can just pass the name and reopen that. For example, to make this work, you'd change:
def worker_with(stream):
    stream.write('In the process\n')
to:
import os

def worker_with(toopen):
    opener = open if isinstance(toopen, basestring) else os.fdopen
    with opener(toopen, 'a') as stream:
        stream.write('In the process\n')
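Putting the pieces together, a minimal runnable sketch (Python 2, matching the basestring check above) might look like the following; note that on Windows the Process creation must sit under the usual if __name__ == '__main__': guard, because the child re-imports the main module:

import multiprocessing
import os
import sys

def worker_with(toopen):
    # toopen is either a file name (str) or a file descriptor (int)
    opener = open if isinstance(toopen, basestring) else os.fdopen
    with opener(toopen, 'a') as stream:
        stream.write('In the process\n')

if __name__ == '__main__':
    sys.stdout.flush()  # Precaution to minimize output interleaving
    # Pass the descriptor for stdout; for a named file, pass the name instead
    w = multiprocessing.Process(target=worker_with, args=(sys.stdout.fileno(),))
    w.start()
    w.join()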
Note: As written, if the fd is one of the standard handles, os.fdopen will close the underlying file descriptor when the with statement exits, which may not be what you want. If you need the file descriptor to survive past the with block, you may want to use os.dup to duplicate the handle before calling os.fdopen, so the two handles are independent of one another.
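For example, a variant of the worker (again, just a sketch) that duplicates the descriptor first, so closing the wrapping file object doesn't close the original handle:

import os

def worker_with(toopen):
    if isinstance(toopen, basestring):
        opener_arg, opener = toopen, open                # named file: reopen by name
    else:
        opener_arg, opener = os.dup(toopen), os.fdopen   # dup the fd so the original stays open
    with opener(opener_arg, 'a') as stream:
        stream.write('In the process\n')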
Other solutions include writing results back to the main process over a multiprocessing.Pipe (so the main process is responsible for passing the data along to sys.stdout, possibly launching a thread to perform that work asynchronously), or using higher-level constructs (e.g. the multiprocessing.Pool().*map* family) that return data via return values instead of explicit file I/O.
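A rough sketch of the Pipe-based approach, where the worker only sends strings and the parent does all the writing to sys.stdout:

import multiprocessing
import sys

def worker(conn):
    conn.send('In the process\n')  # send data instead of writing to a shared handle
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = multiprocessing.Pipe()
    w = multiprocessing.Process(target=worker, args=(child_conn,))
    w.start()
    sys.stdout.write(parent_conn.recv())  # the parent owns all writes to stdout
    w.join()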
If you're really desperate to make this work in general for all file descriptors (and don't care about portability), not just the standard handles and descriptors created on import of __main__, you can use the undocumented Windows utility function multiprocessing.forking.duplicate, which is used internally to explicitly duplicate a file descriptor from one process to another. It would be incredibly hacky (you'd need to look at the rest of the Windows definition of multiprocessing.forking.Popen to see how it would be used), but it would at least allow passing along arbitrary file descriptors, not just ones opened statically at import time.