multiprocessing.Manager() hangs Popen.communicate() on Python

Question

Using multiprocessing.Manager prevents clean termination of a Python child process with subprocess.Popen.terminate() and subprocess.Popen.kill().

This seems to be because Manager creates a child process behind the scenes for communicating, but this process does not know how to clean itself up when the parent is terminated.

What is the easiest way to use multiprocessing.Manager so that it does not prevent a process shutdown by a signal?

A demonstration:

"""Multiprocess manager hang test."""
import multiprocessing
import subprocess
import sys
import time


def launch_and_read_process():
    proc = subprocess.Popen(
        [
            "python",
            sys.argv[0],
            "run_unkillable"
        ],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )

    # Give time for the process to run and print()
    time.sleep(3)

    status = proc.poll()
    print("poll() is", status)

    print("Terminating")
    assert proc.returncode is None
    proc.terminate()
    exit_code = proc.wait()
    print("Got exit code", exit_code)
    stdout, stderr = proc.communicate()
    print("Got output", stdout.decode("utf-8"))


def run_unkillable():
    # Disable manager creation to make the code run correctly
    manager = multiprocessing.Manager()
    d = manager.dict()
    d["foo"] = "bar"
    print("This is an example output", flush=True)
    time.sleep(999)


def main():
    mode = sys.argv[1]
    print("Doing subrouting", mode)
    func = globals().get(mode)
    func()


if __name__ == "__main__":
    main()

Run as python test-script.py launch_and_read_process.

Good output (no multiprocessing.Manager):

    Doing subrouting launch_and_read_process
    poll() is None
    Terminating
    Got exit code -15
    Got output Doing subrouting run_unkillable
    This is an example output

Output when subprocess.Popen.communicate() hangs because of the Manager:

    Doing subrouting launch_and_read_process
    poll() is None
    Terminating
    Got exit code -15

Charchit Agarwal · Answer 1 · 2023-02-08T23:30:50.643

As you pointed out, this happens because the manager spawns its own child process. proc.communicate() hangs because that child process still holds the stdout and stderr pipes open, so communicate() never sees end-of-file. On Unix you can solve this easily by setting your own handlers for SIGTERM and SIGINT. It gets a little hairy on Windows, where those two signals are pretty much useless. Also keep in mind that Python runs signal handlers only in the main thread. Depending on your OS and the signal, if that thread is busy (time.sleep(999)), the whole sleep may need to run out before the handler gets a chance to run. I have provided a solution for both Unix and Windows, with a note at the end:
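The underlying effect can be reproduced without multiprocessing at all. The following is a minimal sketch, assuming a POSIX system where a process started without redirection inherits its parent's stdout. The names CHILD_CODE and pipe_stays_open are made up for illustration. The direct child exits at once, but a grandchild keeps the write end of the pipe open, so communicate() cannot reach end-of-file:

```python
import subprocess
import sys

# Hypothetical child program: it starts a grandchild that inherits stdout,
# prints one line, and exits straight away.
CHILD_CODE = (
    "import subprocess, sys\n"
    "subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(3)'])\n"
    "print('child done', flush=True)\n"
)


def pipe_stays_open(timeout=1.0):
    """Return True if communicate() times out although the child has exited."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
    )
    proc.wait()  # the direct child is gone at this point
    try:
        proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The grandchild still holds the write end of the pipe.
        return True
    finally:
        proc.stdout.close()
    return False
```

The manager's server process plays the role of the grandchild here, which is why the fixes below concentrate on shutting it down.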

UNIX

This is pretty straightforward: define your own handlers for the signals and explicitly call manager.shutdown() in them to clean up the manager's child process properly:

import functools
import multiprocessing
import signal
import sys
import time


def handler(manager, *args):
    """
    Signal handler. Use functools.partial to bind the manager argument
    (or create a factory function instead).
    """
    manager.shutdown()
    sys.exit()


def run_unkillable():
    # Creating the manager starts a helper server process
    manager = multiprocessing.Manager()

    # Register our handler so the manager is shut down on SIGINT/SIGTERM
    h = functools.partial(handler, manager)
    signal.signal(signal.SIGINT, h)
    signal.signal(signal.SIGTERM, h)

    d = manager.dict()
    d["foo"] = "bar"
    print("This is an example output", flush=True)
    time.sleep(999)
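The handler's docstring mentions a factory function as an alternative to functools.partial. A minimal sketch of that variant follows. make_shutdown_handler and install_shutdown_handlers are hypothetical names, and the manager only needs a shutdown() method:

```python
import signal
import sys


def make_shutdown_handler(manager):
    """Build a signal handler that closes over the given manager."""

    def handler(signum, frame):
        # Stop the manager's server process, then exit this process.
        manager.shutdown()
        sys.exit()

    return handler


def install_shutdown_handlers(manager, signums=(signal.SIGINT, signal.SIGTERM)):
    """Register one shared handler for every signal in signums."""
    handler = make_shutdown_handler(manager)
    for signum in signums:
        signal.signal(signum, handler)
    return handler
```

In run_unkillable() this would replace the three registration lines with a single call to install_shutdown_handlers(manager).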

Windows

On Windows, proc.terminate() calls TerminateProcess(), which the child cannot intercept, so you need to send signal.CTRL_BREAK_EVENT explicitly instead (reference). You will also want to sleep in short intervals in a loop instead of calling sleep(999), so that the signal interrupts the main thread instead of waiting for the whole sleep to finish (see this question for alternatives).

"""Multiprocess manager hang test."""
import functools
import multiprocessing
import subprocess
import sys
import time
import signal


def launch_and_read_process():
    proc = subprocess.Popen(
        [
            "python",
            sys.argv[0],
            "run_unkillable"
        ],

        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        # New process group, so the current process does not receive CTRL_BREAK_EVENT itself
        creationflags=subprocess.CREATE_NEW_PROCESS_GROUP,
    )

    # Give time for the process to run and print()
    time.sleep(5)

    status = proc.poll()
    print("poll() is", status)

    print("Terminating")
    assert proc.returncode is None

    # Send this specific signal instead of doing terminate()
    proc.send_signal(signal.CTRL_BREAK_EVENT)

    exit_code = proc.wait()
    print("Got exit code", exit_code)
    stdout, stderr = proc.communicate()
    print("Got output", stdout.decode("utf-8"))


def handler(manager, *args):
    """
    Our handler, use functools.partial to fix arg manager (or you
    can create a factory function too)
    """
    manager.shutdown()
    sys.exit()


def run_unkillable():

    # Creating the manager starts a helper server process
    manager = multiprocessing.Manager()

    # Register our handler for SIGBREAK (raised by CTRL_BREAK_EVENT)
    signal.signal(signal.SIGBREAK, functools.partial(handler, manager))

    d = manager.dict()
    d["foo"] = "bar"
    print("This is an example output", flush=True)

    # Sleep in a loop otherwise the signal won't interrupt the main thread
    for _ in range(999):
        time.sleep(1)


def main():
    mode = sys.argv[1]
    print("Doing subrouting", mode)
    func = globals().get(mode)
    func()


if __name__ == "__main__":
    main()

Note: There is a race condition in the above solution, because the signal handler is registered after the manager is created. In theory, the process could be killed right before the handler is registered, and proc.communicate() would then hang because the manager was never cleaned up. You may therefore want to pass a timeout to .communicate() and add error handling to log these edge cases.
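The timeout suggestion in the note could look roughly like this. It is a sketch, not a definitive implementation: communicate_with_timeout is a hypothetical helper, the 10 second default is an arbitrary assumption, and the partial output attached to the exception may be None:

```python
import logging
import subprocess

logger = logging.getLogger(__name__)


def communicate_with_timeout(proc, timeout=10.0):
    """Return (stdout, stderr, timed_out) without blocking forever."""
    try:
        stdout, stderr = proc.communicate(timeout=timeout)
        return stdout, stderr, False
    except subprocess.TimeoutExpired as exc:
        logger.warning(
            "communicate() timed out after %s s for pid %s; "
            "a leftover manager process may still hold the pipes open",
            timeout,
            proc.pid,
        )
        # Partial output, if any was attached to the exception (may be None).
        return exc.stdout, exc.stderr, True
```

In launch_and_read_process(), the line stdout, stderr = proc.communicate() would become stdout, stderr, timed_out = communicate_with_timeout(proc), with a check for None before decoding stdout.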
