14

I would like to create a parent process which will create many child processes. Since the parent process is only responsible for creating the children, it does not care about their status.

Since subprocess.call is blocking, it doesn't work here. Therefore I use subprocess.Popen instead of call. However, Popen generates zombie (defunct) processes once the children terminate (Link).

Is there a way to solve this problem?

Asclepius
Winston
  • Two things, sounds to me like you are confusing parent and child. And the other question, why would you want to create zombies? – Blubber May 29 '13 at 07:12
  • I wouldn't want zombies. They are just a side effect of Popen which I would like to avoid. – Winston May 29 '13 at 07:16
  • So you want to spawn N processes, let them execute their stuff in parallel, and then block the parent until they are all done? – Blubber May 29 '13 at 07:17
  • Actually the parent just creates children. A child may die at any time and the parent doesn't care, so it's not blocking. – Winston May 29 '13 at 07:18
  • But this is still too non-specific. What is the parent process? Will it continue running while the children run? Or is it just a utility that spawns N processes, then exits and lets the children run? In case of the latter take a look at http://stackoverflow.com/questions/5772873/python-spawn-off-a-child-subprocess-detach-and-exit – Blubber May 29 '13 at 07:21
  • Sorry for not being specific. The parent will run forever and will create child processes at random times. A child will die at a random time and the parent doesn't care about it. – Winston May 29 '13 at 07:27

4 Answers

20

There are a lot of ways to deal with this. The key point is that zombie / "defunct" processes exist so that the parent process can collect their statuses.

  1. As the creator of the process, you can announce your intent to ignore the status. The POSIX method is to set the flag SA_NOCLDWAIT (using sigaction). This is a bit of a pain to do in Python; but most Unix-like systems allow you to simply ignore SIGCHLD / SIGCLD (the spelling varies from one Unix-like system to another), which is easy to do in Python:

    import signal

    signal.signal(signal.SIGCHLD, signal.SIG_IGN)

  2. Or, if this is not available for some reason or does not work on your system, you can use an old stand-by trick: don't just fork once, fork twice. In the first child, fork a second child; in the second child, use execve (or similar) to run the desired program; and then in the first child, exit (with _exit). In the original parent, use wait or waitpid or whatever the OS provides, and collect the status of the first child.

    The reason this works is that the second child has now become an "orphan" (its parent, the first child, died and was collected by your original process). As an orphan it is handed over to a proxy parent (specifically, to "init") which is always wait-ing and hence collects all the zombies right away.
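    A minimal sketch of the double fork (POSIX-only; the function name and the command are my own placeholders, not from the answer):

    ```python
    import os

    def spawn_detached(argv):
        # Fork twice so the grandchild is reparented to init, which reaps
        # it; the parent only ever waits for the short-lived first child.
        pid = os.fork()
        if pid == 0:
            # First child: fork the real worker, then exit immediately.
            if os.fork() == 0:
                os.execvp(argv[0], argv)  # second child: run the program
            os._exit(0)
        else:
            # Original parent: collect the first child right away.
            os.waitpid(pid, 0)

    spawn_detached(["ls", "-l"])
    ```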

  3. In addition to the double fork, you can make your sub-processes live in their own separate session and/or give up controlling terminal access ("daemonize", in Unix-y terms). (This is a bit messy and OS-dependent; I've coded it before but for some corporate code I don't have access to now.)

  4. Finally, you could simply collect those processes periodically. If you're using the subprocess module, simply call the .poll function on each process, whenever it seems convenient. This will return None if the process is still running, and the exit status (having collected it) if it has finished. If some are still running, your main program can exit anyway while they keep running; at that point, they become orphaned, as in method #2 above.
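Method 4 can be sketched like this (the `children` list and the helper names are my own, not part of `subprocess`):

```python
import subprocess

children = []

def spawn(cmd):
    children.append(subprocess.Popen(cmd))

def reap():
    # .poll() collects the exit status of any finished child (preventing
    # zombies) and returns None for children that are still running.
    children[:] = [p for p in children if p.poll() is None]

spawn(["sleep", "1"])
reap()  # call whenever convenient; finished children are collected here
```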

The "ignore SIGCHLD" method is simple and easy but has the drawback of interfering with library routines that create and wait-for sub-processes. There's a work-around in Python 2.7 and later (http://bugs.python.org/issue15756) but it means the library routines can't see any failures in those sub-processes.

[Edit: http://bugs.python.org/issue1731717 is for p.wait(), where p is a process from subprocess.Popen; 15756 is specifically for p.poll(); but in any case if you don't have the fixes, you have to resort to methods 2, 3, or 4.]

torek
  • Thanks I tried signal.signal but it would cause `OSError: [Errno 10] No child processes` when I call `subprocess.call` – Winston May 29 '13 at 09:41
  • That means you don't have the work-around (the bugs.python.org link above). `subprocess`'s `call` consists of a `Popen` followed by a `wait`, and you've told the OS to throw away child status-es, so the underlying `os.wait` fails with `errno.ECHILD`. (The work-around just treats this as meaning "subprocess exit code = 0".) – torek May 29 '13 at 10:05
  • the signal.signal(signal.SIGCHLD, signal.SIG_IGN) technique worked for me – Alix Martin Jan 13 '15 at 08:07
  • Thanks, signal.signal(signal.SIGCHLD, signal.SIG_IGN) worked for me on python 2.7.10 running in a docker container – phansen Dec 03 '15 at 23:04
  • Although `signal.signal(signal.SIGCHLD, signal.SIG_IGN)` works, it may be unsafe to use with `multiprocessing`, etc. Its use is not specific to a particular child process. – Asclepius Jun 09 '23 at 02:26
3

After a process is terminated or killed, the operating system waits for the parent process to collect the child's exit status. You can use the process's communicate() method to collect it:

p = subprocess.Popen( ... )
p.terminate()
p.communicate()

Note that terminating a process allows it to intercept the terminate signal and do whatever it wants with it. This matters because p.communicate() is a blocking call.

If you do not want this behavior, use p.kill() instead of p.terminate(); the process cannot intercept that signal.

If you want to use p.terminate() and be sure the process has actually ended, you can use the psutil module to check on the process status.
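If psutil isn't available, the standard library alone can do a similar check on Python 3.3+, where Popen.wait accepts a timeout (this is my stdlib substitution, not the psutil check the answer describes):

```python
import subprocess

p = subprocess.Popen(["sleep", "10"])
p.terminate()              # send SIGTERM; the child may catch or ignore it
try:
    p.wait(timeout=5)      # reaps the child, so no zombie remains
except subprocess.TimeoutExpired:
    p.kill()               # SIGTERM was ignored; SIGKILL cannot be caught
    p.wait()
```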

Waschbaer
  • This seems to be the correct, and simplest, solution. It worked for me, in a case where I was piping one subprocess into another, and the first subprocess could hang. – Jason Drew Feb 20 '23 at 21:22
0

torek's methods are fine!

I found another way to deal with defunct processes:

we can use os.wait3 with os.WNOHANG (a non-blocking way to reap children, like waitpid) to collect defunct processes as needed:

import os, subprocess, time

def recycle_pid():
    # Reap any finished children without blocking.
    while True:
        try:
            pid, status, _ = os.wait3(os.WNOHANG)
            if pid == 0:  # children exist, but none has exited yet
                break
            print("----- child %d terminated with status: %d" % (pid, status))
        except OSError:  # ECHILD: no child processes left to wait for
            break

print("+++++ start pid:", subprocess.Popen("ls").pid)
recycle_pid()
print("+++++ start pid:", subprocess.Popen("ls").pid)
recycle_pid()
time.sleep(1)
recycle_pid()

recycle_pid is non-blocking and can be called as needed.

xjdrew
  • Thanks I think the solution is a bit complicated. I think there are some python library that would handle child processes (e.g. killing a zombie process). Do you know if there are any? Thanks – Winston Mar 19 '15 at 06:45
  • You can't kill a zombie process; you can refer to this thread: http://stackoverflow.com/questions/16944886/how-to-kill-zombie-process – xjdrew Mar 19 '15 at 08:20
-2

Please look at http://docs.python.org/2/library/multiprocessing.html

It provides an API that is very similar to threads. You can wait for the child process to exit, if you want.
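A sketch of that approach, wrapping a shell command in multiprocessing.Process (the worker function and command are placeholders):

```python
import multiprocessing
import subprocess

def run_command(cmd):
    # Each worker process runs one shell command to completion.
    subprocess.call(cmd)

if __name__ == "__main__":
    procs = [multiprocessing.Process(target=run_command, args=(["ls"],))
             for _ in range(3)]
    for p in procs:
        p.start()   # non-blocking: the parent keeps running
    for p in procs:
        p.join()    # optional: reap the children when convenient
```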

vowelless
  • I get the distinct impression that he *doesn't* want to wait for the children to exit. – Blubber May 29 '13 at 07:24
  • It seems he wants to launch a bunch of processes which can be accomplished with Process.start(). This call is non-blocking. The parent-child relationship is still maintained, and so he can take whatever action he wants to take with the child. The parent will run in parallel. – vowelless May 29 '13 at 07:42
  • Thanks for the reply. Actually the child process is a shell command, which can't be executed by multiprocessing. Would there be any alternative? – Winston May 29 '13 at 08:19
  • You could do: **1**. Create N Process objects **2**. Call start() on all the objects **3**. The object then calls subprocess.call That way, you can execute shell commands using multiprocess. – vowelless May 29 '13 at 18:24