
I have a python script: zombie.py

from multiprocessing import Process
from time import sleep
import atexit

def foo():
    while True:
        sleep(10)

@atexit.register
def stop_foo():
    p.terminate()
    p.join()

if __name__ == '__main__':
    p = Process(target=foo)
    p.start()

    while True:
        sleep(10)

When I run this with python zombie.py & and kill the parent process with kill -2, stop_foo() is correctly called and both processes terminate.

Now, suppose I have a shell script, zombie.sh:

#!/bin/sh

python zombie.py &

echo "done"

And I run ./zombie.sh from the command line.

Now stop_foo() never gets called when the parent gets killed. If I run kill -2 on the parent process, nothing happens. kill -15 and kill -9 both kill only the parent process, not the child:

[foo@bar ~]$ ./zombie.sh 
done
[foo@bar ~]$ ps -ef | grep zombie | grep -v grep
foo 27220     1  0 17:57 pts/3    00:00:00 python zombie.py
foo 27221 27220  0 17:57 pts/3    00:00:00 python zombie.py
[foo@bar ~]$ kill -2 27220
[foo@bar ~]$ ps -ef | grep zombie | grep -v grep
foo 27220     1  0 17:57 pts/3    00:00:00 python zombie.py
foo 27221 27220  0 17:57 pts/3    00:00:00 python zombie.py
[foo@bar ~]$ kill 27220
[foo@bar ~]$ ps -ef | grep zombie | grep -v grep
foo 27221     1  0 17:57 pts/3    00:00:00 python zombie.py

What is going on here? How can I make sure the child process dies with the parent?

user545424
  • related: [How to make child process die after parent exits?](http://stackoverflow.com/q/284325/4279) – jfs Feb 28 '14 at 01:43

2 Answers


Neither atexit nor p.daemon = True will truly ensure that the child process dies with the father. Receiving a SIGTERM does not trigger the atexit routines.

To make sure the child gets killed upon its father's death, you will have to install a signal handler in the father. This way you can react to most signals (SIGQUIT, SIGINT, SIGHUP, SIGTERM, ...) but not to SIGKILL; there simply is no way to react to that signal from within the process that receives it.

Install a signal handler for all useful signals and in that handler kill the child process.
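
A minimal sketch of such a handler, based on the zombie.py script from the question. The helper names make_handler and install_handlers are made up for illustration, and the exit status 128 + signum is only the usual shell convention, not something the question requires. Installing a SIGINT handler explicitly also matters in the zombie.sh case: a non-interactive POSIX shell starts background commands with SIGINT and SIGQUIT ignored, and Python does not install its default KeyboardInterrupt handler when it inherits SIGINT as ignored, which is consistent with kill -2 having no effect there.

```python
import signal
import sys
from multiprocessing import Process
from time import sleep

# Catchable signals (Unix only); SIGKILL can never be handled.
HANDLED_SIGNALS = (signal.SIGINT, signal.SIGTERM, signal.SIGHUP, signal.SIGQUIT)


def foo():
    # Same child workload as in the question.
    while True:
        sleep(10)


def make_handler(child):
    """Return a signal handler that stops `child`, then exits the parent."""
    def handler(signum, frame):
        child.terminate()
        child.join()
        # 128 + signum is an assumed convention for "terminated by signal".
        sys.exit(128 + signum)
    return handler


def install_handlers(child, signals=HANDLED_SIGNALS):
    """Install the same handler for every signal in `signals`."""
    handler = make_handler(child)
    for sig in signals:
        # An explicit handler replaces an inherited "ignore" disposition too.
        signal.signal(sig, handler)


def main():
    p = Process(target=foo)
    p.start()
    # Install after start() so the child keeps the default dispositions.
    install_handlers(p)
    while True:
        sleep(10)
```

Calling main() under the usual if __name__ == '__main__': guard should behave like zombie.py, except that kill -1, kill -2, kill -3 and kill -15 on the parent now also stop the child. kill -9 on the parent still leaves the child running.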

Alfe
  • Installing signal handlers won't "truly ensure" it either. On Linux, there is `prctl()`, but it would be overkill. – jfs Feb 28 '14 at 01:42
  • What do you mean by father? The bash script? I have already installed the `atexit.register` in the python script, which should respond to `SIGINT` from `kill -2`. – user545424 Feb 28 '14 at 16:49
  • No, the father process I referred to is the Python script you started. By using the `multiprocessing` module, it creates a child process. – Alfe Feb 28 '14 at 23:59
  • And actually, no, `atexit` reacts to a controlled exit of the Python script. SIGINT is caught by the interpreter and translated into a `KeyboardInterrupt` exception, which then propagates in an orderly way and results in a controlled termination of the script; thus the `atexit` handlers are triggered. This no longer works if any other signal (one that the Python interpreter does not handle this gracefully) terminates the process. – Alfe Mar 01 '14 at 00:02

Update: This solution doesn't work for processes killed by a signal.


Your child process is not a zombie. It is alive.

If you want the child process to be killed when its parent exits normally then set p.daemon = True before p.start(). From the docs:

When a process exits, it attempts to terminate all of its daemonic child processes.

Looking at the source code, it is clear that multiprocessing uses an atexit callback to kill its daemonic children, i.e., it won't work if the parent is killed by a signal. For example:

#!/usr/bin/env python
import logging
import os
import signal
import sys
from multiprocessing import Process, log_to_stderr
from threading import Timer
from time import sleep

def foo():
    while True:
        sleep(1)

if __name__ == '__main__':
    log_to_stderr().setLevel(logging.DEBUG)
    p = Process(target=foo)
    p.daemon = True
    p.start()

    # either kill itself or exit normally in 5 seconds
    if '--kill' in sys.argv:
        Timer(5, os.kill, [os.getpid(), signal.SIGTERM]).start()
    else: # exit normally
        sleep(5)

Output

$ python kill-orphan.py
[INFO/Process-1] child process calling self.run()
[INFO/MainProcess] process shutting down
[DEBUG/MainProcess] running all "atexit" finalizers with priority >= 0
[INFO/MainProcess] calling terminate() for daemon Process-1
[INFO/MainProcess] calling join() for process Process-1
[DEBUG/MainProcess] running the remaining "atexit" finalizers

Notice the "calling terminate() for daemon" line.

Output (with --kill)

$ python kill-orphan.py --kill
[INFO/Process-1] child process calling self.run()

The log shows that if the parent is killed by a signal, then the "atexit" callback is not called (and ps shows that the child is still alive in this case). See also Multiprocess Daemon Not Terminating on Parent Exit.
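
One possible way around that limitation, as a sketch rather than anything the multiprocessing docs prescribe: install a SIGTERM handler that calls sys.exit(), so the interpreter shuts down normally and the atexit-based cleanup of daemonic children gets a chance to run. The names exit_on_signal and install_exit_handlers are hypothetical, and SIGKILL still cannot be handled this way.

```python
import signal
import sys


def exit_on_signal(signum, frame):
    """Turn a catchable signal into an ordinary interpreter exit.

    sys.exit() raises SystemExit in the main thread, so atexit callbacks
    (including the one multiprocessing registers) run on the way out.
    """
    sys.exit(128 + signum)  # exit status convention is an assumption


def install_exit_handlers(signals=(signal.SIGTERM,)):
    """Route `signals` through exit_on_signal; return the previous handlers."""
    previous = {}
    for sig in signals:
        previous[sig] = signal.signal(sig, exit_on_signal)
    return previous
```

In kill-orphan.py this would mean calling install_exit_handlers() after p.start(), so that the child keeps the default SIGTERM disposition.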

jfs
  • This doesn't work on my system. I'm running `Scientific Linux release 6.5 (Carbon)`. – user545424 Feb 28 '14 at 16:46
  • @user545424: It fails on my system too (Ubuntu). I'd expected it to work for such signals as `SIGTERM`. It works for `SIGINT` though (`kill -2`). – jfs Feb 28 '14 at 18:17