8

I am using the multiprocessing module and starting multiple workers with a pool. However, file descriptors that are open in the parent process are closed in the worker processes, and I need them to stay open. Is there any way to pass file descriptors so they are shared between the parent and its children?

kumar
  • 2,696
  • 3
  • 26
  • 34
  • As mentioned, you'll need to use OS-specific features. Which platforms are you interested in supporting? – Rakis Jun 07 '10 at 13:38
  • I need to support Windows and Linux, so I don't want to use any OS-specific features. On Linux, file handles are shared by default, and Windows also has an option to share file handles during CreateProcess()... I don't know why the multiprocessing module doesn't have an extra option to share file handles. – kumar Jun 07 '10 at 13:40
  • 1
    As Windows & Linux differ in semantics of passing file handles, you're probably going to *have* to use OS specific features. No problem there though, it's easy to tell the difference from `sys.platform` and just call an OS-specific "make it work for this OS" function. I suggest reading through the multiprocessing module's code to see if there's an easy work-around. – Rakis Jun 07 '10 at 13:54
  • Can you please explain how you figured out they are closed? From what I've read, passing file descriptors among processes simply doesn't work (I haven't found a closer explanation). How do you know that the descriptor is closed and, e.g., not passed or something else? Thank you – Wakan Tanka Sep 07 '15 at 13:10
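To illustrate the behavior kumar describes in the comments (on Linux, file handles are shared by default), here is a minimal sketch, assuming Linux with the default `fork` start method; a raw descriptor passed as an argument remains valid in the child, since `fork` copies the parent's descriptor table:

```python
import multiprocessing
import os

def child(write_fd):
    # With the "fork" start method the child inherits the parent's
    # descriptor table, so this raw integer fd is still valid here.
    os.write(write_fd, b"hi from child")
    os.close(write_fd)

read_fd, write_fd = os.pipe()
proc = multiprocessing.Process(target=child, args=(write_fd,))
proc.start()
os.close(write_fd)  # close the parent's copy; the child keeps its own
message = os.read(read_fd, 1024).decode()
proc.join()
os.close(read_fd)
```

This does not work with the `spawn` start method (the default on Windows), where the child starts with a fresh descriptor table; that is the case the question is really about.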

4 Answers

9

On both Python 2 and Python 3, the multiprocessing.reduction module provides functions for sending and receiving file descriptors.

Example code (Python 2 and Python 3):

import multiprocessing
import os

# Before fork
child_pipe, parent_pipe = multiprocessing.Pipe(duplex=True)

child_pid = os.fork()

if child_pid:
    # Inside parent process
    import multiprocessing.reduction
    import socket
    # socket_to_pass is the socket object we want to pass to the child
    socket_to_pass = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    socket_to_pass.connect("/dev/log")
    # child_pid argument to send_handle() can be arbitrary on Unix,
    # on Windows it has to be child PID
    multiprocessing.reduction.send_handle(parent_pipe, socket_to_pass.fileno(), child_pid)
    socket_to_pass.send("hello from the parent process\n".encode())
else:
    # Inside child process
    import multiprocessing.reduction
    import socket
    import os
    fd = multiprocessing.reduction.recv_handle(child_pipe)
    # rebuild the socket object from fd
    received_socket = socket.fromfd(fd, socket.AF_UNIX, socket.SOCK_DGRAM)
    # socket.fromfd() duplicates fd, so we can close the received one
    os.close(fd)
    # and now you can communicate using the received socket
    received_socket.send("hello from the child process\n".encode())
Piotr Jurkiewicz
  • 1,653
  • 21
  • 25
  • 1
    `send_handle` and `recv_handle` also exist in Python 2 as I mention in my answer [here](https://stackoverflow.com/a/54047120/1698058), no real need to distinguish between the two. Also, I would not use sending a socket as an example since `multiprocessing` registers `socket.socket` with their custom Pickler - a `socket.socket` object can be sent over a `Connection` directly without any of the code above but if left as-is then someone might copy it into their own code base. – Chris Hunt Jan 05 '19 at 00:52
  • OK, I see that it was backported to Python 2 as well. As for sending `socket.socket` without additional code, it doesn't seem to work on arbitrary pipes/sockets, only on those created with the whole `multiprocessing.Manager` machinery. – Piotr Jurkiewicz Jan 05 '19 at 02:03
3

There is also a fork of multiprocessing called multiprocess, which replaces pickle with dill. dill can pickle file descriptors, and thus multiprocess can easily pass them between processes.

>>> f = open('test.txt', 'w')
>>> _ = f.write('hello world')
>>> f.close()
>>> import multiprocess as mp
>>> p = mp.Pool()
>>> f = open('test.txt', 'r')
>>> p.apply(lambda x:x, f)
'hello world'
>>> f.read()
'hello world'
>>> f.close()
Mike McKerns
  • 33,715
  • 8
  • 119
  • 139
2

multiprocessing itself has helper functions for transferring file descriptors between processes: send_handle and recv_handle in multiprocessing.reduction. They work on Windows, and on Unix platforms that support sending file descriptors over Unix domain sockets. They are not documented, but they are listed in the module's __all__, so it may be safe to assume they are part of the public API. Judging from the source, they have been available since at least Python 2.6 and 3.3.

All platforms have the same interface:

  • send_handle(conn, handle, destination_pid)
  • recv_handle(conn)

Where:

  • conn (multiprocessing.Connection): connection over which to send the file descriptor
  • handle (int): integer referring to file descriptor/handle
  • destination_pid (int): integer pid of the process that is receiving the file descriptor - this is currently only used on Windows
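A minimal Unix sketch of this interface (assuming Linux with the default `fork` start method; note that the default duplex `multiprocessing.Pipe()` is backed by an AF_UNIX socketpair on Unix, which is what `send_handle` needs there):

```python
import multiprocessing
import multiprocessing.reduction
import os

def child(conn):
    # Receive a duplicated descriptor over the Connection
    fd = multiprocessing.reduction.recv_handle(conn)
    os.write(fd, b"written via received fd")
    os.close(fd)

parent_conn, child_conn = multiprocessing.Pipe()  # duplex by default
read_fd, write_fd = os.pipe()
proc = multiprocessing.Process(target=child, args=(child_conn,))
proc.start()
# destination_pid is only consulted on Windows, but pass the real pid anyway
multiprocessing.reduction.send_handle(parent_conn, write_fd, proc.pid)
os.close(write_fd)
message = os.read(read_fd, 1024).decode()
proc.join()
os.close(read_fd)
```

The same two calls work on Windows (where they go through DuplicateHandle internally), which is what makes this the most portable option within the standard library.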
Chris Hunt
  • 3,840
  • 3
  • 30
  • 46
0

There isn't a way that I know of to share file descriptors between processes. If a way exists, it is most likely OS specific.

My guess is that you need to share data on another level.

Mattias Nilsson
  • 3,639
  • 1
  • 22
  • 29
  • 4
    Agreed. There are OS-specific ways though. – unbeli Jun 07 '10 at 13:23
  • 1
    Yeah, I know that fork() for example will duplicate file descriptors, but is there an easy way to do it _after_ the processes have started? – Mattias Nilsson Jun 07 '10 at 14:15
  • 1
    Yep: http://stackoverflow.com/questions/909064/portable-way-to-pass-file-descriptor-between-different-processes – Rakis Jun 07 '10 at 14:36
  • 1
    `multiprocessing` itself has functions that handle the OS-specific details, see [here](https://stackoverflow.com/a/54047120/1698058). – Chris Hunt Jan 04 '19 at 22:40