268

I am looking for a way to clean up the mess when my top-level script exits.

Especially if I want to use set -e, I wish the background process would die when the script exits.

user513951
elmarco
  • @DanielKaplan Try e.g. `p=$(bash -c 'sleep 2 >/dev/null & echo $!'); sleep 1; ps -f -p "$p"` to see that `sleep 2` command is still running after `bash` has exited. – jarno Dec 23 '22 at 12:57
  • @DanielKaplan The `sleep 2` command is running in background as a separate process; its command ends with `&`. – jarno Jan 01 '23 at 12:22
  • @jarno Apologies. I was incorrect about my first comment so I've deleted my others. – Daniel Kaplan Jan 01 '23 at 23:29

16 Answers

255

This works for me (improved thanks to the commenters):

trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT
  • kill -- -$$ sends a SIGTERM to the whole process group, thus also killing descendants. The <PGID> in kill -- -<PGID> is the process group ID, which often, but not necessarily, equals the PID held in the $$ variable. In the few cases where PGID and PID differ, you can obtain the PGID in your script with ps or similar tools.

    For example: pgid="$(ps -o pgid= $$ | grep -o '[0-9]*')" stores PGID in $pgid.

  • Specifying signal EXIT is useful when using set -e (more details here).

Benjamin Buch
tokland
  • 2
    Should work well on the whole, but the child processes may change process groups. On the other hand it doesn't require job control, and may also get some grandchild processes missed by other solutions. – michaeljt Dec 13 '12 at 09:17
  • 8
    I don't quite understand `-$$`. It evaluates to `-<PID>`, e.g. `-1234`. In the kill manpage and the builtin manpage, a leading dash specifies the signal to be sent. However, `--` probably blocks that, but then the leading dash is otherwise undocumented. Any help? – Evan Benn Jul 11 '19 at 04:48
  • 9
    @EvanBenn: Check `man 2 kill`, which explains that when a PID is negative, the signal is sent to all processes in the process group with the provided ID (https://en.wikipedia.org/wiki/Process_group). It's confusing that this is not mentioned in `man 1 kill` or `man bash`, and could be considered a bug in the documentation. – user001 Sep 01 '19 at 22:32
  • `kill -- -$$` might not terminate anything depending on how your script is called, e.g. if you call it like `(sleep 100 & your_script)` in shell. On the other hand, if you use `kill -- 0` in the trap, it would terminate `sleep 100` and the enclosing shell, too. – jarno Jul 03 '20 at 10:24
  • What if one needs to execute some cleanup after the child processes are terminated? I guess that instead of de-registering the trap, it would be possible to pass an empty string to it, so that SIGTERM is ignored: `trap 'trap " " SIGTERM; kill 0; wait; cleanup' SIGINT SIGTERM`. Do you see any problem with this? – Egidio Docile Aug 16 '20 at 19:56
  • When I use ShellCheck on this, it says `SC2064: Use single quotes, otherwise this expands now rather than when signalled.`, should the answer be updated to reflect this? – Michael Yoo Nov 02 '20 at 03:39
  • @EgidioDocile yes, `kill 0` would not have an effect. – jarno Nov 03 '20 at 20:53
  • @MichaelYoo no, because the value of `$$` remains the same. – jarno Nov 03 '20 at 21:02
  • 6
    Why do we have two nested traps here? – Mohammed Noureldin Oct 27 '21 at 21:44
  • 4
    @MohammedNoureldin The inner `trap - SIGTERM` will reset the current script SIGTERM response to the default kill behavior. Then, when `kill -- -$$` is executed, the current script will receive SIGTERM and exit normally. – gallo Apr 10 '22 at 15:37
  • Should this expression maybe use single quotes instead of the double quotes? – Martin J.H. Apr 26 '22 at 11:41
  • 2
    Warning: you may need to disable the *builtin* shell kill command (`enable -n kill`) or use `/bin/kill`, as it may be that the builtin `kill` doesn't support the `-pgid` syntax. – Martijn Pieters Oct 12 '22 at 11:55
  • Where does this go in relation to the top-level script? – Merchako Jan 13 '23 at 19:56
  • 1
    @Merchako This tripped me up too. The trap command should go at the top of the script. I had it at the bottom. I probably would have discovered this if I had actually read the man page. – Joshua Jurgensmeier Jan 26 '23 at 16:38
  • @MartijnPieters I am using bash `5.2.15` and `kill -SIGTERM -- -pid` appears to be working just fine. – laur Aug 06 '23 at 14:45
  • 1
    @laur: I actually had to go to the bash source code to try and figure this out, as I can't remember the exact context I was working in when I discovered I had to disable the built-in kill command. Via the [github `bash` mirror](https://github.com/bminor/bash/commit/7117c2d221b2aed4ede8600f6a36b7c1454b4f55#diff-c9460842d51f7e1e1954ceb4b0c9b2dfc47c39929642088b906d206c8ea1ea62) I learned that Bash 2.05 *fixed* support for `-pid`. So either I was working with a really ancient bash system, or perhaps it was a mac with some neutered bash version, or I wasn't working with bash at all. – Martijn Pieters Aug 09 '23 at 16:42
235

To clean up some mess, trap can be used. It registers a list of commands to execute when a specific signal arrives:

trap "echo hello" SIGINT

but can also be used to execute something if the shell exits:

trap "killall background" EXIT

It's a builtin, so help trap will give you information (works with bash). If you only want to kill background jobs, you can do

trap 'kill $(jobs -p)' EXIT

Be sure to use single quotes ('), to prevent the shell from substituting the $() immediately.
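The effect of the quoting style can be checked with a small experiment. This is a sketch, not part of the original answer: the variable names `state`, `seen_double` and `seen_single` are made up for illustration, and USR1/USR2 are used only because they are harmless to send to the current shell.

```shell
#!/bin/sh
state="before"

# Double quotes: $state is expanded now, while the trap is being defined.
trap "seen_double=$state" USR1

# Single quotes: $state is expanded later, when the trap actually runs.
trap 'seen_single=$state' USR2

state="after"

# Deliver both signals to this shell; each handler runs once the
# kill command has finished.
kill -USR1 "$$"
kill -USR2 "$$"

echo "double-quoted trap saw: $seen_double"
echo "single-quoted trap saw: $seen_single"
```

The same reasoning applies to `$(jobs -p)`: in double quotes the job list would be captured at the moment the trap is installed, typically before any background job exists.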

Johannes Schaub - litb
  • 1
    then how do you killall *child* only ? (or am I missing something obvious) – elmarco Dec 11 '08 at 20:51
  • 24
    killall kills your children, but not you – orip Dec 11 '08 at 22:10
  • 5
    `kill $(jobs -p)` doesn't work in dash, because it executes command substitution in a subshell (see Command Substitution in man dash) – user1431317 Jun 15 '17 at 13:37
  • 35
    is `killall background` supposed to be a placeholder? `background` is not in the man page... – Evan Benn Jul 11 '19 at 04:45
  • @user1431317 but you could redirect the output of `jobs -p` to a temporary file and read it from there for `kill`. – jarno Aug 22 '20 at 05:39
  • 8
    `kill $(jobs -p)` is good, but prints usage info for 'kill' when there are no background jobs. IMHO, the best way for bash is `jobs -p | xargs -r kill` – Alek Sep 03 '20 at 20:39
  • 1
    Some shells do not trigger `EXIT` on `ctrl-c`. Adding `trap "exit" INT TERM ERR` along with `trap "kill 0" EXIT` fixes this problem – KJ Sudarshan Sep 20 '20 at 03:15
148

Update: https://stackoverflow.com/a/53714583/302079 improves this by adding exit status and a cleanup function.

trap "exit" INT TERM
trap "kill 0" EXIT

Why convert INT and TERM to exit? Because both should trigger the kill 0 without entering an infinite loop.

Why trigger kill 0 on EXIT? Because normal script exits should trigger kill 0, too.

Why kill 0? Because nested subshells need to be killed as well. This will take down the whole process tree.

korkman
  • 1
    This is the only version that worked for me on OSX with `GNU bash, version 4.3.33(1)-release (x86_64-apple-darwin14.0.0)`. – stiemannkj1 Feb 09 '15 at 20:15
  • The only proper solution for me on OSX – Phương Nguyễn Feb 26 '15 at 01:35
  • 7
    The only solution for my case on Debian. – DifferentPseudonym Jun 06 '15 at 17:39
  • 6
    Neither the answer by Johannes Schaub nor the one provided by tokland managed to kill the background processes my shell script started (on Debian). This solution worked. I don't know why this answer is not more upvoted. Could you expand more about what exactly `kill 0` means/does? – josch Nov 24 '16 at 15:09
  • 2
    @josch if you haven't found out yet, [here's an explanation for `kill 0`](http://unix.stackexchange.com/a/67552/23316) – sanmai Jan 09 '17 at 02:39
  • 11
    This is awesome, but also kills my parent shell :-( – vidstige Mar 07 '17 at 13:31
  • 12
    This solution is literally overkill. kill 0 (inside my script) ruined my whole X session! Perhaps in some cases kill 0 can be useful, but this does not change the fact that it is not general solution and should be avoided if possible unless there is very good reason to use it. It would be nice to add a warning that it may kill parent shell or even whole X session, not just background jobs of a script! – Lissanro Rayen Jun 02 '17 at 14:20
  • Dumb workaround that restricts the `kill 0` to just the shell script's processes: run the script under `setsid -w` (which has other side effects, unfortunately); https://stackoverflow.com/questions/6549663/how-to-set-process-group-of-a-shell-script – Ivan Kozik Nov 30 '17 at 10:56
  • I would add `QUIT` as well to the traps. – CMCDragonkai Jun 27 '18 at 01:32
  • Thank you. For me this improved it if my script just started a background process and stopped `trap "exit" INT TERM` `trap "sleep 0.1; kill 0" EXIT` – Drew LeSueur Sep 21 '18 at 17:38
  • 4
    While this might be an interesting solution under some circumstances, as pointed out by @vidstige this will kill the **whole process group** which includes the launching process (i.e. the parent shell in most cases). Definitely not something you want when you are running a script via an IDE. – matpen Dec 30 '18 at 17:36
  • This is the only answer that worked in my case. – ldog Mar 01 '23 at 20:37
24

The trap 'kill 0' SIGINT SIGTERM EXIT solution described in @tokland's answer is really nice, but latest Bash crashes with a segmentation fault when using it. That's because Bash, starting from v. 4.3, allows trap recursion, which becomes infinite in this case:

  1. shell process receives SIGINT or SIGTERM or EXIT;
  2. the signal gets trapped, executing kill 0, which sends SIGTERM to all processes in the group, including the shell itself;
  3. go to 1 :)

This can be worked around by manually de-registering the trap:

trap 'trap - SIGTERM && kill 0' SIGINT SIGTERM EXIT

A fancier way that allows printing the received signal and avoids "Terminated:" messages:

#!/usr/bin/env bash

trap_with_arg() { # from https://stackoverflow.com/a/2183063/804678
  local func="$1"; shift
  for sig in "$@"; do
    trap "$func $sig" "$sig"
  done
}

stop() {
  trap - SIGINT EXIT
  printf '\n%s\n' "received $1, killing child processes"
  kill -s SIGINT 0
}

trap_with_arg 'stop' EXIT SIGINT SIGTERM SIGHUP

{ i=0; while (( ++i )); do sleep 0.5 && echo "a: $i"; done } &
{ i=0; while (( ++i )); do sleep 0.6 && echo "b: $i"; done } &

while true; do read; done

UPD: added a minimal example; improved stop function to avoid de-trapping unnecessary signals and to hide "Terminated:" messages from the output. Thanks Trevor Boyd Smith for the suggestions!

skozin
  • in `stop()` you provide the first argument as the signal number but then you hardcode what signals are being deregistered. rather than hardcode the signals being deregistered you could use the first argument to deregister in the `stop()` function (doing so would potentially stop other recursive signals (other than the 3 hardcoded)). – Trevor Boyd Smith Feb 16 '15 at 15:28
  • @TrevorBoydSmith, this would not work as expected, I guess. For example, the shell might be killed with `SIGINT`, but `kill 0` sends `SIGTERM`, which will get trapped once again. This will not produce infinite recursion, though, because `SIGTERM` will be de-trapped during the second `stop` call. – skozin Feb 16 '15 at 17:48
  • Probably, `trap - $1 && kill -s $1 0` should work better. I'll test and update this answer. Thank you for the nice idea! :) – skozin Feb 16 '15 at 17:50
  • Nope, `trap - $1 && kill -s $1 0` wouldn't work either, as we can't kill with `EXIT`. But it is really sufficient to de-trap `TERM`, because `kill` sends this signal by default. – skozin Feb 16 '15 at 18:10
  • I tested recursion with `EXIT`, the `trap` signal-handler is always only executed once. – Trevor Boyd Smith Feb 16 '15 at 18:18
  • Updated the answer, now `stop` doesn't require modification, regardless of the signal list. And it seems that, when the shell gets killed with `INT`, it doesn't print "Terminated:" messages (which it does in case of `TERM`). – skozin Feb 16 '15 at 18:18
  • @TrevorBoydSmith, I meant that you cannot `kill -s EXIT pid`, which `stop` will try to do if it was written like `trap - $1 && kill -s $1 0` and invoked by `EXIT`. – skozin Feb 16 '15 at 18:20
  • `/bin/sh` symlinked to `dash` produces the error `trap: SIGINT: bad trap`. Removing `SIG` prefix from signal names (such that `SIGINT` becomes `INT`, etc.) works as expected with both `dash` and `bash`. – ack Jul 26 '17 at 12:15
  • Had I not known anything about bash, I would have assumed `"recieved $1, killing children"` meant "earning a dollar and murdering kids". You might wanna reword that. – Sapphire_Brick Sep 04 '20 at 21:44
  • 1
    @Sapphire_Brick done, now it should be harder to misinterpret the message. – skozin Oct 26 '20 at 10:14
  • I don't understand why do we have two nested traps? – Mohammed Noureldin Oct 27 '21 at 21:44
  • @MohammedNoureldin the second `trap - SIGINT EXIT` command clears the trap so the `stop` function doesn't get called recursively when the process finally exits. – skozin Nov 17 '21 at 14:57
21

trap 'kill $(jobs -p)' EXIT

I would make only minor changes to Johannes' answer and use jobs -pr to limit the kill to running processes and add a few more signals to the list:

trap 'kill $(jobs -pr)' SIGINT SIGTERM EXIT
Cody Gray - on strike
raytraced
  • 1
    Why not kill the stopped jobs, too? In Bash EXIT trap will be run in case of SIGINT and SIGTERM, too, so the trap would be called twice in case of such a signal. – jarno Jul 03 '20 at 08:43
10

To be on the safe side I find it better to define a cleanup function and call it from trap:

cleanup() {
        local pids=$(jobs -pr)
        [ -n "$pids" ] && kill $pids
}
trap "cleanup" INT QUIT TERM EXIT [...]

or avoiding the function altogether:

trap '[ -n "$(jobs -pr)" ] && kill $(jobs -pr)' INT QUIT TERM EXIT [...]

Why? Because by simply using trap 'kill $(jobs -pr)' [...] one assumes that there will be background jobs running when the trap condition is signalled. When there are no jobs one will see the following (or similar) message:

kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]

because jobs -pr is empty - I ended in that 'trap' (pun intended).

tdaitx
  • This test case `[ -n "$(jobs -pr)" ]` doesn't work on my bash. I use GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu). The "kill: usage" message keeps popping up. – Douwe van der Leest May 27 '19 at 06:52
  • I suspect it has to do with the fact that `jobs -pr` doesn't return the PIDs of the children of the background processes. It doesn't tear the entire process tree down, only trims off the roots. – Douwe van der Leest May 27 '19 at 07:06
4
function cleanup_func {
    sleep 0.5
    echo cleanup
}

trap "exit \$exit_code" INT TERM
trap "exit_code=\$?; cleanup_func; kill 0" EXIT

# exit 1
# exit 0

Like https://stackoverflow.com/a/22644006/10082476, but with added exit-code

Delaware
3

A nice version that works under Linux, BSD and MacOS X. First tries to send SIGTERM, and if it doesn't succeed, kills the process after 10 seconds.

KillJobs() {
    for job in $(jobs -p); do
            kill -s SIGTERM $job > /dev/null 2>&1 || (sleep 10 && kill -9 $job > /dev/null 2>&1 &)

    done
}

TrapQuit() {
    # Whatever you need to clean here
    KillJobs
}

trap TrapQuit EXIT

Please note that jobs does not include grand children processes.
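One way to read the intent stated above (SIGTERM first, SIGKILL only if the process is still around after a delay) is sketched below for a single PID. This is an interpretation, not the answer's own code: `terminate_with_grace` is a hypothetical helper name, and its 10-second default simply mirrors the delay mentioned in the answer.

```shell
#!/bin/sh
# terminate_with_grace PID [SECONDS]
# Hypothetical helper: sends SIGTERM, waits up to SECONDS (default 10)
# for the process to disappear, then sends SIGKILL.
terminate_with_grace() {
    pid=$1
    grace=${2:-10}

    # Ask politely first. If the signal cannot be sent, the process is
    # already gone (or not ours), so there is nothing left to do.
    kill -s TERM "$pid" 2>/dev/null || return 0

    # Poll once per second. kill -0 sends no signal; it only checks
    # whether the PID can still be signalled.
    waited=0
    while [ "$waited" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        waited=$((waited + 1))
    done

    # Still signalable after the grace period: force termination.
    kill -s KILL "$pid" 2>/dev/null || true
}
```

Unlike the backgrounded `sleep 10 && kill -9` above, this version blocks the caller for up to the grace period per process, which may or may not be acceptable inside an EXIT trap.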

Orsiris de Jong
1

I made an adaptation of @tokland's answer, combined with the knowledge from http://veithen.github.io/2014/11/16/sigterm-propagation.html, when I noticed that trap doesn't trigger if I'm running a foreground process (not backgrounded with &):

#!/bin/bash

# killable-shell.sh: Kills itself and all children (the whole process group) when killed.
# Adapted from http://stackoverflow.com/a/2173421 and http://veithen.github.io/2014/11/16/sigterm-propagation.html
# Note: Does not work (and cannot work) when the shell itself is killed with SIGKILL, for then the trap is not triggered.
trap "trap - SIGTERM && echo 'Caught SIGTERM, sending SIGTERM to process group' && kill -- -$$" SIGINT SIGTERM EXIT

echo "$@"
"$@" &
PID=$!
wait $PID
trap - SIGINT SIGTERM EXIT
wait $PID

Example of it working:

$ bash killable-shell.sh sleep 100
sleep 100
^Z
[1]  + 31568 suspended  bash killable-shell.sh sleep 100

$ ps aux | grep "sleep"
niklas   31568  0.0  0.0  19640  1440 pts/18   T    01:30   0:00 bash killable-shell.sh sleep 100
niklas   31569  0.0  0.0  14404   616 pts/18   T    01:30   0:00 sleep 100
niklas   31605  0.0  0.0  18956   936 pts/18   S+   01:30   0:00 grep --color=auto sleep

$ bg
[1]  + 31568 continued  bash killable-shell.sh sleep 100

$ kill 31568
Caught SIGTERM, sending SIGTERM to process group
[1]  + 31568 terminated  bash killable-shell.sh sleep 100

$ ps aux | grep "sleep"
niklas   31717  0.0  0.0  18956   936 pts/18   S+   01:31   0:00 grep --color=auto sleep
nh2
1

I have finally found a solution that appears to work in all cases to kill all descendants recursively, regardless of whether they are jobs or sub-processes. The other solutions here all seemed to fail with things such as:

while ! ffmpeg ....
do
  sleep 1
done

In my situation, ffmpeg would keep running after the parent script exited.

I found a solution here for recursively getting the PIDs of all child processes and used it in the trap handler:

cleanup() {
    # kill this process and all of its descendants, children before parents
    kill $(pidtree $$ | tac)
}

pidtree() (
    [ -n "$ZSH_VERSION"  ] && setopt shwordsplit
    declare -A CHILDS
    while read P PP;do
        CHILDS[$PP]+=" $P"
    done < <(ps -e -o pid= -o ppid=)
    walk() {
        echo $1
        for i in ${CHILDS[$1]};do
            walk $i
        done
    }

    for i in "$@";do
        walk $i
    done
)

trap cleanup EXIT

Placed at the start of a bash script, the above succeeds in killing all child processes. Note that pidtree is called with $$, which is the PID of the exiting bash script. The list of PIDs (one per line) is reversed using tac to try to ensure that parent processes are killed only after their children, which avoids possible race conditions in loops such as the example I gave.

Dino Dini
0

Universal solution which works also in sh (jobs there does not output anything to stdout):

trap "pkill -P $$" EXIT INT
excitoon
  • This only kills child processes. It wouldn't handle common cases like jobs started by a subshell (which end up with PPID 1). Killing the process group with `-g` would do that. – Trevor Robinson May 27 '23 at 17:19
  • Well, I assumed typical scenario when sub-processes are responsible for killing their children. – excitoon May 28 '23 at 20:22
0

None of the answers here worked for me in the case of a continuous integration (CI) script that starts background processes from subshells. For example:

(cd packages/server && npm start &)

The subshell terminates after starting the background process, which therefore ends up with parent PID 1.

With PPID not an option, the only portable (Linux and MacOS) and generic (independent of process name, listening ports, etc.) approach left is the process group (PGID). However, I can't just kill that because it would kill the script process, which would fail the CI job.

# Terminate the given process group, excluding this process. Allows 2 seconds
# for graceful termination before killing remaining processes. This allows
# shutdown errors to be printed, while handling processes that fail to
# terminate quickly.
kill_subprocesses() {
  echo "Terminating subprocesses of PGID $1 excluding PID $$"
  # Get all PIDs in this process group except this process
  # (pgrep on NetBSD/MacOS does this by default, but Linux pgrep does not)
  # Uses a here-string instead of piping to avoid including the grep PID
  pids=$(grep -Ev "\\<$$\\>" <<<"$(pgrep -g "$1")")
  if [ -n "$pids" ]; then
    echo "Terminating processes: ${pids//$'\n'/, }"
    # shellcheck disable=SC2086
    kill $pids || true
  fi
  sleep 2
  # Check for remaining processes and kill them
  pids=$(grep -Ev "\\<$$\\>" <<<"$(pgrep -g "$1")")
  if [ -n "$pids" ]; then
    echo "Killing remaining processes: ${pids//$'\n'/, }"
    # shellcheck disable=SC2086
    kill -9 $pids || true
  fi
}

# Terminate subprocesses on exit or interrupt
# shellcheck disable=SC2064
trap "kill_subprocesses $$" EXIT SIGINT SIGTERM
Trevor Robinson
-1

jobs -p does not work in all shells if called in a sub-shell, possibly unless its output is redirected into a file but not a pipe. (I assume it was originally intended for interactive use only.)

What about the following:

trap 'while kill %% 2>/dev/null; do jobs > /dev/null; done' INT TERM EXIT [...]

The call to "jobs" is needed with Debian's dash shell, which fails to update the current job ("%%") if it is missing.
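A different way to sidestep the `jobs` portability problem, not proposed in this answer, is to record each PID from `$!` at launch and kill exactly those on exit. The sketch below uses only POSIX sh features; `bg_pids`, `start_bg` and `kill_bg` are hypothetical names, and the two `sleep 30` commands stand in for real background work.

```shell
#!/bin/sh
# Space-separated list of PIDs started through start_bg.
bg_pids=""

# start_bg CMD [ARGS...]: run CMD in the background and remember its PID.
start_bg() {
    "$@" &
    bg_pids="$bg_pids $!"
}

# kill_bg: send SIGTERM to every remembered PID that still exists.
kill_bg() {
    for pid in $bg_pids; do
        # kill -0 only checks that the process can be signalled.
        if kill -0 "$pid" 2>/dev/null; then
            kill "$pid" 2>/dev/null
        fi
    done
    bg_pids=""
}

# Turn INT and TERM into a normal exit so the EXIT trap always runs.
trap "exit" INT TERM
trap kill_bg EXIT

start_bg sleep 30
start_bg sleep 30
```

Like the `jobs`-based variants, this only reaches the processes the script started directly; grandchildren are left to their parents.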

michaeljt
  • Hmm interesting approach, but it does not seem to work. Consider script `trap 'echo in trap; set -x; trap - TERM EXIT; while kill %% 2>/dev/null; do jobs > /dev/null; done; set +x' INT TERM EXIT; sleep 100 & while true; do printf .; sleep 1; done` If you run it in Bash (5.0.3) and try to terminate, there seems to be an infinite loop. However, if you terminate it again, it works. Even with Dash (0.5.10.2-6) you have to terminate it twice. – jarno Jul 03 '20 at 14:12
-1

Another option is to have the script set itself as the process group leader, and trap a killpg on your process group on exit.

EDIT: a possible bash hack to create a new process group is to use setsid(1) but only if we're not already the process group leader (can query it with ps).

Placing this at the beginning of the script can achieve that.

# Create a process group and exec the script as its leader if necessary
[[ "$(ps -o pgid= $$)" -eq "$$" ]] || exec setsid /bin/bash "$0" "$@"

Then signaling the process group with kill -- -$$ would work as expected even when script is not already the process group leader.

orip
  • How do you set the process as process group leader? What is "killpg"? – jarno Jul 03 '20 at 09:41
  • 1
    [killpg](https://man7.org/linux/man-pages/man3/killpg.3.html) is the C API to send a signal (=kill) to a process group, so it is exactly what the `kill -- -$$` and `kill 0` answers suggest; starting a new process group is the novel idea here but needs details on how to do this _from bash_... – Beni Cherniavsky-Paskin Aug 13 '23 at 16:01
  • `setsid(1)` can do it, and we can test whether we're the leader with `ps`. So the bash hack would be to add something like this to the beginning of the script ` [[ "$(ps -o pgid= $$)" -eq "$$" ]] || exec setsid /bin/bash "$0" "$@"` – orip Aug 14 '23 at 13:23
-1

Just for diversity I will post a variation of https://stackoverflow.com/a/2173421/102484 , because that solution leads to the message "Terminated" in my environment:

trap 'test -z "$intrap" && export intrap=1 && kill -- -$$' SIGINT SIGTERM EXIT
noonex
-4

So script the loading of the script. Run a killall (or whatever is available on your OS) command that executes as soon as the script is finished.

Oli