16

I have a program that spawns and communicates with CPU-heavy, unstable processes that I did not write. If my app crashes or is killed by SIGKILL, I want the subprocesses to be killed as well, so the user doesn't have to track them down and kill them manually.

I know this topic has been covered before, but I have tried all the methods described, and none of them seems to pass the test.

I know it must be possible, since terminals do it all the time. If I run something in a terminal, and kill the terminal, the stuff always dies.

I have tried atexit, double fork and ptys. atexit doesn't work for SIGKILL; double fork doesn't work at all; and I have found no way to work with ptys from Python.

Today, I found out about prctl(PR_SET_PDEATHSIG, SIGKILL), which should be a way for child processes to order a kill on themselves when their parent dies. I tried to use it with Popen, but it seems to have no effect at all:

import ctypes, subprocess
libc = ctypes.CDLL('/lib/libc.so.6')
PR_SET_PDEATHSIG = 1  # option number from <linux/prctl.h>
TERM = 15             # SIGTERM
# Runs in the child, between fork() and exec()
implant_bomb = lambda: libc.prctl(PR_SET_PDEATHSIG, TERM)
subprocess.Popen(['gnuchess'], preexec_fn=implant_bomb)

In the above, the child is created and the parent exits. Now you would expect gnuchess to receive SIGTERM (signal 15, as requested in the prctl call) and die, but it doesn't. I can still find it in my process manager using 100% CPU.

Can anybody tell me if there is something wrong with my use of prctl, or do you know how terminals manage to kill their children?

Thomas Ahle
  • 28
    I find this title very disturbing :( – Dalbir Singh Dec 10 '09 at 23:53
  • 6
    Should be on moms4moms.com stackexchange site... – Antony Dec 11 '09 at 00:17
  • 3
    Actually, I found that your approach seemed to work just fine - perhaps it's just that your test program (gnuchess) actually does some double-forking, or something similar? I tested with just a simple python script - see my "answer" below for the exact test code I used. (Sorry, would have included the code in the comment, but I guess you can't have code blocks in them...) – Paul Molodowitch Jul 11 '13 at 18:07

8 Answers

14

I know it's been years, but I found a simple (slightly hacky) solution to this problem. From your parent process, launching each command through a very simple C wrapper that calls prctl() and then exec() solves this problem on Linux. I call it "yeshup":

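#include <signal.h>
#include <sys/prctl.h>  /* declares prctl() and provides the PR_* constants */
#include <unistd.h>

int main(int argc, char **argv) {
     if(argc < 2)
          return 1;
     /* Ask the kernel to send this process SIGHUP when its parent dies */
     prctl(PR_SET_PDEATHSIG, SIGHUP, 0, 0, 0);
     /* The setting survives the exec; execvp() only returns on failure */
     return execvp(argv[1], &argv[1]);
}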

When spawning your child processes from Python (or any other language), you can run "yeshup gnuchess [arguments]". You'll find that, when the parent process is killed, all your child processes should be sent SIGHUP.

This works because Linux preserves the prctl setting (it does not clear it) across the call to execvp, which effectively "transforms" the yeshup process into a gnuchess process (or whatever command you specify there). A fork(), by contrast, clears the setting in the new child.
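The same fork, prctl, exec sequence can also be sketched directly in Python with `preexec_fn`, which avoids the separate wrapper binary. This is only a sketch under assumptions: `spawn_with_pdeathsig` is a hypothetical helper name, Linux is assumed, and the `getppid()` comparison is one way to handle the case where the parent has already died before `prctl()` runs.

```python
import ctypes
import os
import signal
import subprocess

PR_SET_PDEATHSIG = 1  # option number from <linux/prctl.h>
libc = ctypes.CDLL(None, use_errno=True)  # assumes Linux, where libc exports prctl


def spawn_with_pdeathsig(argv, sig=signal.SIGHUP, **popen_kwargs):
    """Hypothetical helper: start argv so it receives sig when this process dies."""
    parent_pid = os.getpid()

    def in_child():
        # Runs in the child only, after fork() and before exec().
        if libc.prctl(PR_SET_PDEATHSIG, ctypes.c_ulong(int(sig)), 0, 0, 0) != 0:
            raise OSError(ctypes.get_errno(), "prctl(PR_SET_PDEATHSIG) failed")
        # If the parent died before prctl() took effect, no signal will ever
        # arrive, so deliver it by hand.
        if os.getppid() != parent_pid:
            os.kill(os.getpid(), sig)

    return subprocess.Popen(argv, preexec_fn=in_child, **popen_kwargs)
```

Something like `spawn_with_pdeathsig(['gnuchess'])` would then replace the plain `Popen` call. Note that the Python documentation warns that `preexec_fn` is not safe in the presence of threads.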

coolbho3k
  • Can some variant of this fly for a bash-based test case? I've been monkeying with it without any luck. – Rhys Ulerich Oct 29 '13 at 05:08
  • 1
    But how is this different from calling `fork(); prctl(); exec()`? – Thomas Ahle Jun 20 '14 at 13:39
  • Thomas: That might work, but remember, OP is trying to call a subprocess from Python. – coolbho3k Jun 30 '14 at 18:55
  • 1
    @coolbho3k I know, but the preexec_fn in Python is exactly(?) like plugging a call in between fork and exec. Which is equivalent to this solution. – Thomas Ahle Aug 11 '14 at 08:20
  • @coolbho3k - i think this is the best solution - also thoughts on adding a `if (getppid() == 1) return 1;` as a check after `prctl()` - that way if the parent process dies and init takes over the orphan as a parent before `prctl()` , you're not waiting on init to die (you could be waiting a while :) – gnr Oct 16 '15 at 00:18
6

prctl's PR_SET_PDEATHSIG can only be set for this very process that's calling prctl -- not for any other process, including this specific process's children. The way the man page I'm pointing to expresses this is "This value is cleared upon a fork()" -- fork, of course, is the way other processes are spawned (in Linux and any other Unix-y OS).

If you have no control over the code you want to run in subprocesses (as would be the case, essentially, for your gnuchess example), I suggest you first spawn a separate small "monitor" process whose role is to keep track of all of its siblings and send them killer signals when the common parent dies. Your parent process can let the monitor know about those siblings' pids as it spawns them. The monitor needs to poll for the parent's death, waking up every N seconds (for some N of your choice) to check whether the parent is still alive; use select in a loop, with a timeout of N seconds, to wait for more info from the parent.

Not trivial, but then such system tasks often aren't. Terminals do it differently (via the concept of a "controlling terminal" for a process group) but of course it's trivial for any child to block THAT off (double forks, nohup, and so on).
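A minimal sketch of the monitor's polling loop follows. The function name `monitor`, the one-PID-per-line pipe protocol and the default interval are assumptions made for illustration; the approach above only prescribes select with a timeout and a check on whether the parent is still alive.

```python
import os
import select
import signal


def monitor(read_fd, parent_pid, interval=1.0, sig=signal.SIGTERM):
    """Collect sibling PIDs from read_fd, then signal them once the parent is gone.

    Assumed protocol: the parent writes one decimal PID per line to the pipe.
    """
    pids, buf = set(), b""
    while True:
        # Wake up when the parent sends data, or after `interval` seconds.
        ready, _, _ = select.select([read_fd], [], [], interval)
        eof = False
        if ready:
            chunk = os.read(read_fd, 4096)
            eof = not chunk  # empty read: every write end has been closed
            buf += chunk
            *lines, buf = buf.split(b"\n")
            pids.update(int(line) for line in lines if line.strip())
        # The parent is gone if the pipe closed or we have been re-parented.
        if eof or os.getppid() != parent_pid:
            break
    for pid in pids:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # that sibling has already exited
    return pids
```

The parent would create the pipe, start the monitor before any other child, and write each new child's pid to the write end. PIDs can be reused by the kernel, so a long-lived monitor risks signalling an unrelated process; the parent could send a removal message when a child exits normally.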

Alex Martelli
  • 5
    `preexec_fn` is called after `fork()`, and the man page doesn't say this flag is cleared on `exec()`. – Denis Otkidach Dec 11 '09 at 08:52
  • As Denis writes, I'm under the impression that the preexec_fn parameter will place my prctl call between the fork and the exec calls. The idea of a monitor is actually quite good. Unless it crashes of course, so it has to be quite simple. Can I know its parent is dead when it receives SIGHUP? And then let it SIGTERM its brothers and sisters. Can you tell me more about spawning controlling terminals? – Thomas Ahle Dec 11 '09 at 10:14
  • From [\[Python\]: Popen Constructor](https://docs.python.org/3/library/subprocess.html#popen-constructor) (probably changed along the way): _"If preexec_fn is set to a callable object, this object will be called **in the child process** just before the child is executed. (POSIX only)."_ – CristiFati Oct 26 '17 at 16:44
3

Actually I found that your original approach worked just fine for me - here's the exact example code I tested with which worked:

echoer.py

#!/usr/bin/env python3

import time
import sys
i = 0
try:
    while True:
        i += 1
        print(i, flush=True)
        time.sleep(1)
except KeyboardInterrupt:
    # SIGINT arrives as KeyboardInterrupt in the main thread
    print("\nechoer caught KeyboardInterrupt")
    sys.exit(0)

parentProc.py

#!/usr/bin/env python3

import ctypes
import subprocess
import time

libc = ctypes.CDLL('/lib64/libc.so.6')  # adjust the path for your distribution
PR_SET_PDEATHSIG = 1
SIGINT = 2
SIGTERM = 15

def set_death_signal(signal):
    # Called via preexec_fn, so this runs in the child before exec
    libc.prctl(PR_SET_PDEATHSIG, signal)

def set_death_signal_int():
    set_death_signal(SIGINT)

def set_death_signal_term():
    set_death_signal(SIGTERM)

#subprocess.Popen(['./echoer.py'], preexec_fn=set_death_signal_term)
subprocess.Popen(['./echoer.py'], preexec_fn=set_death_signal_int)
time.sleep(1.5)
print("parentProc exiting...")
Paul Molodowitch
  • doesn't work "better", just different - to see the diff for yourself (on a linux system), create both the above files in the same dir, make them executable, and then run "parentProc.py". The child process should get a KeyboardInterrupt signal sent to it, which you'll know because it will print "echoer caught KeyboardInterrupt". If you change parentProc.py to use set_death_signal_term, the echoer will still get killed, but in a more abrupt fashion. – Paul Molodowitch Aug 09 '14 at 00:39
1

I'm wondering if the PR_SET_PDEATHSIG flag is getting cleared even though you set it after the fork (and before the exec), where the docs suggest it shouldn't be cleared.

In order to test that theory, you could try the following: use the same code to run a subprocess that's written in C and basically just calls prctl(PR_GET_PDEATHSIG, &result) and prints the result.

Another thing you might try: adding explicit zeros for arg3, arg4, and arg5 when you call prctl. I.e.:

>>> implant_bomb = lambda: libc.prctl(PR_SET_PDEATHSIG, TERM, 0, 0, 0)
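
The check suggested above uses a small C program; a rough Python equivalent, as a sketch, has the exec'ed child read the value back through ctypes. The names `REPORT` and `pdeathsig_after_exec` are hypothetical, Linux is assumed, and the constant 2 is PR_GET_PDEATHSIG from <linux/prctl.h>.

```python
import ctypes
import signal
import subprocess
import sys

PR_SET_PDEATHSIG = 1
PR_GET_PDEATHSIG = 2
libc = ctypes.CDLL(None)  # assumes Linux, where libc exports prctl


def implant_bomb():
    # Same call as in the question, with explicit zeros for arg3..arg5.
    libc.prctl(PR_SET_PDEATHSIG, ctypes.c_ulong(int(signal.SIGTERM)), 0, 0, 0)


# Program run by the child after exec: prints the death signal it inherited.
REPORT = (
    "import ctypes\n"
    "result = ctypes.c_int()\n"
    f"ctypes.CDLL(None).prctl({PR_GET_PDEATHSIG}, ctypes.byref(result), 0, 0, 0)\n"
    "print(result.value)\n"
)


def pdeathsig_after_exec():
    """Return the parent-death signal that the child sees after exec."""
    out = subprocess.check_output([sys.executable, "-c", REPORT],
                                  preexec_fn=implant_bomb)
    return int(out)
```

If `pdeathsig_after_exec()` returns 0, the flag was cleared or never set; if it returns the SIGTERM number, the setting survived the exec and the problem lies elsewhere, for example in the target program forking again.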
Edward Loper
1

I thought the double fork was to detach from a controlling terminal. I'm not sure how you are trying to use it.

It's a hack, but you could always call 'ps' and search for the name of the process you're trying to kill.

BillMan
  • But how can I call ps when my controlling process is dead? That's the entire point. The user of my app won't know that the children aren't dead, and will just feel his/her computer being much slower until reboot. – Thomas Ahle Dec 11 '09 at 10:08
  • Can you wrap the program that is being created with your own executable and then write the PID to a file when its started? – BillMan Dec 13 '09 at 09:32
  • I could probably find a way to clean stuff up on restart, but I don't know if I will be restarted. Thus I need a way to clean up at the time I die. – Thomas Ahle Dec 17 '09 at 14:18
1

I've seen very nasty ways of "clean-up" using things like ps xuawww | grep myApp | awk '{ print $2 }' | xargs -n1 kill -9 (with this ps format the PID is the second column).

The client process, if popened, can catch SIGPIPE and die. There are many ways to go about this, but it really depends on a lot of factors. If you throw some ping code (ping to parent) in the child, you can ensure that a SIGPIPE is issued on death. If it catches it, which it should, it'll terminate. You'd need bidirectional communication for this to work correctly... or to always block against the client as the originator of communication. If you don't want to modify the child, ignore this.

Assuming that you don't expect the actual Python interpreter to segfault, you could add each PID to a sequence, and then kill them all on exit. This should be safe for normal exits and even uncaught exceptions. Python has facilities (such as atexit) to run code at exit, for clean-up.
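
As a rough sketch of that idea, the snippet below keeps every child in a list and registers a clean-up function with atexit. The names `spawn_tracked` and `kill_children` are hypothetical, and this does not help when the parent itself is killed by SIGKILL, because atexit handlers never run in that case.

```python
import atexit
import subprocess

_children = []  # every Popen object started through spawn_tracked()


def spawn_tracked(argv, **popen_kwargs):
    """Hypothetical helper: start a child and remember it for clean-up."""
    proc = subprocess.Popen(argv, **popen_kwargs)
    _children.append(proc)
    return proc


def kill_children():
    """Terminate every tracked child that is still running."""
    for proc in _children:
        if proc.poll() is None:
            proc.terminate()  # SIGTERM first
    for proc in _children:
        try:
            proc.wait(timeout=2)
        except subprocess.TimeoutExpired:
            proc.kill()  # escalate to SIGKILL if SIGTERM was ignored


# Runs on normal interpreter exit and after uncaught exceptions,
# but not after SIGKILL or a hard crash of the interpreter.
atexit.register(kill_children)
```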

Here's some safer nasty: Append each child PID to a file, including your master process (separate file). Use file locking. Build a watchdog daemon that looks at the flock() state of your master pid. If it's not locked, kill every PID in your child PID list. Run the same code on startup.
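
The lock check at the heart of that watchdog can be sketched as follows, assuming the master holds an exclusive flock() on a lock file for its whole lifetime. The function names and the lock-file path argument are assumptions for illustration.

```python
import fcntl
import os


def hold_master_lock(path):
    """Called once by the master: take the lock and keep the fd open."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    return fd  # the kernel drops the lock when the master exits or is killed


def master_is_alive(path):
    """Called by the watchdog: True while some process still holds the lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return True  # lock is held, so the master is still running
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False  # nobody holds the lock: the master is gone
    finally:
        os.close(fd)
```

The watchdog would call `master_is_alive()` periodically and kill every PID in the child list once it returns False. Because the kernel releases the lock when the holder dies, this also covers SIGKILL, as long as no surviving forked child still holds a copy of the locked descriptor.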

More nasty: Write the PIDs to files, as above, then invoke your app in a sub-shell: (./myMaster; ./killMyChildren)

pestilence669
  • If you read my code snippet, you'll see that I run gnuchess in the new process, so there is no way I can run it as a thread. Does Popen not use fork? – Thomas Ahle Dec 11 '09 at 10:06
1

There are some security restrictions to take into account: if setuid is called after execv, the child cannot receive the signal. The complete list of these restrictions is here

good luck !
/Mohamed

Mohamed Hamzaoui
  • Interesting, so `the signal gets delivered only if parent process has sufficient privileges to send signals to child processes. Typically any child process running with higher privilege than its parent will receive no signal.` I haven't noticed such a thing. How can I detect if it somehow increases its privilege? – Thomas Ahle Jun 20 '14 at 13:58
1

Other answers mention prctl's PR_SET_PDEATHSIG but leave out the fact that this can be set from the command line using the setpriv command:

setpriv --pdeathsig HUP [command] &
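
From Python, the same command line can be built by prefixing the child's argv. This is a small sketch: `with_pdeathsig` is a hypothetical helper, and it assumes a setpriv on PATH that is recent enough to support `--pdeathsig`.

```python
import shutil
import subprocess


def with_pdeathsig(argv, sig="HUP"):
    """Hypothetical helper: wrap argv so that setpriv sets the death signal."""
    return ["setpriv", "--pdeathsig", sig, *argv]


def spawn_via_setpriv(argv, sig="HUP", **popen_kwargs):
    # Fail early with a clear message if setpriv is not installed.
    if shutil.which("setpriv") is None:
        raise FileNotFoundError("setpriv not found on PATH")
    return subprocess.Popen(with_pdeathsig(argv, sig), **popen_kwargs)
```

For example, `spawn_via_setpriv(['gnuchess'])` would run the equivalent of the command above without going through a shell.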
obadz