59

In this answer to another question, I was told that

in scripts you don't have job control (and trying to turn it on is stupid)

This is the first time I've heard this, and I've pored over the bash.info section on Job Control (chapter 7), finding no mention of either of these assertions. [Update: The man page is a little better, mentioning 'typical' use, default settings, and terminal I/O, but no real reason why job control is particularly ill-advised for scripts.]

So why doesn't script-based job-control work, and what makes it a bad practice (aka 'stupid')?

Edit: The script in question starts a background process, starts a second background process, then attempts to put the first process back into the foreground so that it has normal terminal I/O (as if run directly), which can then be redirected from outside the script. Can't do that to a background process.

As noted by the accepted answer to the other question, there exist other scripts that solve that particular problem without attempting job control. Fine. And the lambasted script uses a hard-coded job number, which is obviously bad. But I'm trying to understand whether job control is a fundamentally doomed approach. It still seems like maybe it could work...
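
For concreteness, here is a minimal sketch of that pattern (hypothetical commands, not the actual script):

#!/bin/bash
set -m              # explicitly enable job control in the script
first_thing &       # becomes job %1
second_thing &      # becomes job %2
fg %1               # try to reattach job 1 to the terminal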

system PAUSE
  • Please add a simple example showing how you would find this useful that can't be done easily without job control. – dwc Mar 27 '09 at 15:49

7 Answers

54

What he meant is that job control is turned off by default in non-interactive mode (i.e., in a script).

From the bash man page:

JOB CONTROL
       Job  control refers to the ability to selectively stop (suspend)
       the execution of processes and continue (resume) their execution at a
       later point.
       A user typically employs this facility via an interactive interface
       supplied jointly by the system’s terminal driver and bash.

and

   set [--abefhkmnptuvxBCHP] [-o option] [arg ...]
      ...
      -m      Monitor mode.  Job control is enabled.  This option is on by
              default for interactive shells on systems that support it (see
              JOB CONTROL above).  Background processes run in a separate
              process group and a line containing their exit status  is
              printed  upon  their completion.

When he said "is stupid" he meant that not only:

  1. is job control meant mostly for facilitating interactive control (whereas a script can work directly with the PIDs -- see the sketch below), but also
  2. to quote his original answer, it "... relies on the fact that you didn't start any other jobs previously in the script which is a bad assumption to make." Which is quite correct.
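
For instance, the PID-based version of "start something in the background and deal with it later" looks like this (a rough sketch; long_task is a placeholder):

long_task &          # start it in the background
pid=$!               # remember its PID; no %1 assumptions anywhere
# ... start as many other jobs as you like here ...
wait "$pid"          # block until that specific process finishes
echo "long_task exited with status $?"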

UPDATE

In answer to your comment: yes, nobody will stop you from using job control in your bash script -- there is no hard case for forcefully disabling set -m (i.e., yes, job control from the script will work if you want it to). Remember that in the end, especially in scripting, there is always more than one way to skin a cat, but some ways are more portable, more reliable, make it simpler to handle error cases, parse the output, etc.

Your particular circumstances may or may not warrant a way different from what lhunath (and other users) deem "best practices".

vladr
  • 2
    +1 Accurately detailed. Job control is a feature to make handling jobs on the (interactive) prompt more convenient. There is no reason why anyone would want it in scripts as you can just keep the PIDs of your background processes and *wait* on or *kill* them. – lhunath Mar 27 '09 at 17:21
  • Thanks! Weird that the man page has better info than the bash.info file. – system PAUSE Mar 27 '09 at 21:11
  • 2
    Ok, I *get* that a hard-coded job number is a bad idea. No issue there. But words like "*by default* for interactive" and "user *typically* employs" and "meant *mostly* for" all hint strongly that there is *some* esoteric use case for job control in a script. Otherwise set -m should fail in scripts. – system PAUSE Mar 27 '09 at 21:21
  • 27
    There is one very important reason for enabling job control to be useful inside scripts: the side effect it has of placing background processes in their own process groups. This makes it much, much easier to send signals to them _and their children_ with one simple command: `kill -- -$pgid` (see the sketch after these comments). All other ways of dealing with signaling entire trees of processes either involve elaborate (sometimes even recursive) functions, which are often bug nests, or risk killing the parent in the process (no pun intended). – ack Dec 07 '13 at 05:27
  • @ack, your comment makes for an answer, that's a very good point – iruvar Jan 15 '15 at 02:26
  • 3
    How would you then, for example, start a background process, run a couple of setup commands, then move it to the foreground, using pids (I do this a lot for docker containers) ? – Jonathan Jul 06 '17 at 09:06
  • Similar to @Jonathan's question, I have a use-case where I want some process that I launch to complete its internal setup before handing control back to the launching bash script. With job control, it seems like this can be achieved by the process sending itself SIGSTOP when it's properly set up, and the calling script can then call bg to place the now-fully-ready process into the background. Is there a better way to accomplish this kind of "await" behavior in bash? – Seth P Oct 22 '17 at 21:22
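
To illustrate the process-group point from the comments above, a minimal sketch (worker.sh is a made-up name for a program that spawns children of its own):

#!/bin/bash
set -m                   # side effect: each background job gets its own process group
./worker.sh &            # start the worker; it becomes a process-group leader
pgid=$!                  # under set -m, the leader's PID equals its process-group ID
# ... later: signal the worker and every child it spawned, in one shot
kill -TERM -- "-$pgid"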
36

Job control with bg and fg is useful only in interactive shells. But & in conjunction with wait is useful in scripts too.

On multiprocessor systems, spawning background jobs can greatly improve the script's performance, e.g. in build scripts where you want to start at least one compiler per CPU, or when processing images with ImageMagick tools in parallel, etc.

The following example runs up to 8 parallel gcc's to compile all source files in an array:

#!/bin/bash
...
# compile in batches of 8; wait for each batch before starting the next
for ((i = 0, end = ${#sourcefiles[@]}; i < end; )); do
    for ((cpu_num = 0; cpu_num < 8; cpu_num++, i++)); do
        if ((i < end)); then gcc -c "${sourcefiles[i]}" & fi
    done
    wait    # block until all jobs of this batch have finished
done

There is nothing "stupid" about this. But you'll require the wait command, which waits for all background jobs before the script continues. The PID of the last background job is stored in the $! variable, so you may also wait ${!}. Note also the nice command.
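
For instance, to run a single compile in the background at lower priority and then wait for that specific job (a sketch; $file is a placeholder for one source file):

nice -n 10 gcc -c "$file" &   # lower the job's priority with nice
wait "$!"                     # wait only for the job just started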

Sometimes such code is useful in makefiles:

buildall:
    for cpp_file in *.cpp; do gcc -c $$cpp_file & done; wait

This gives much finer control than make -j.

Note that & is a line terminator like ; (write command& not command&;).

Hope this helps.

Andreas Spindler
  • 7,568
  • 4
  • 43
  • 34
  • 3
    for the readers: `wait` also allows for multiple pids, e.g. `wait 3940 4001 4012 4024`, but will wait for *all* of them to finish before continuing. – zamnuts Nov 13 '13 at 07:00
  • Recently I byte-compiled Emacs/Lisp _.el_ files to _.elc_ in a script. After `wait` all Emacssen had finished but not all _.elc_ files were there. I additionally had to wait for them, like in `while [[ ! -e $file ]]; do :; done`. Happened under Windows/Cygwin, but I reckon this could happen under any file system. So outside of Makefiles, if files are required, better not simply trust `wait`. – Andreas Spindler May 24 '15 at 08:34
8

Job control is useful only when you are running an interactive shell, i.e., you know that stdin and stdout are connected to a terminal device (/dev/pts/* on Linux). Then, it makes sense to have something in the foreground, something else in the background, etc.

Scripts, on the other hand, don't have such a guarantee. Scripts can be made executable and run without any terminal attached. It doesn't make sense to have foreground or background processes in this case.

You can, however, run commands non-interactively in the background (by appending "&" to the command line) and capture their PIDs with $!. Then you can use kill to terminate or suspend them (simulating Ctrl-C or Ctrl-Z on the terminal, as if the shell were interactive). You can also use wait (instead of fg) to wait for the background process to finish.
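
For example (a sketch; sleep stands in for a real workload):

sleep 10 &            # start a background process; no job control involved
pid=$!                # remember its PID
kill -STOP "$pid"     # suspend it, as Ctrl-Z would
kill -CONT "$pid"     # resume it (it stays in the background)
wait "$pid"           # instead of fg: block until it finishes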

Juliano
  • The "fg 1" was intended specifically to cause the stdin and stdout of the &'d process to reattach to the interactive terminal session. The caller of the script (a person or another script) could then choose whether to redirect them. – system PAUSE Mar 27 '09 at 21:03
7

It could be useful to turn on job control in a script to set traps on SIGCHLD. The JOB CONTROL section in the manual says:

The shell learns immediately whenever a job changes state. Normally, bash waits until it is about to print a prompt before reporting changes in a job's status so as to not interrupt any other output. If the -b option to the set builtin command is enabled, bash reports such changes immediately. *Any trap on SIGCHLD is executed for each child that exits.*

(emphasis is mine)

Take the following script, as an example:

dualbus@debian:~$ cat children.bash 
#!/bin/bash

set -m
count=0 limit=3
trap 'counter && { job & }' CHLD
job() {
  local amount=$((RANDOM % 8))
  echo "sleeping $amount seconds"
  sleep "$amount"
}
counter() {
  ((count++ < limit))
}
counter && { job & }
wait
dualbus@debian:~$ chmod +x children.bash 
dualbus@debian:~$ ./children.bash 
sleeping 6 seconds
sleeping 0 seconds
sleeping 7 seconds

Note: CHLD trapping seems to be broken as of bash 4.3

In bash 4.3, you could use 'wait -n' to achieve the same thing, though:

dualbus@debian:~$ cat waitn.bash 
#!/home/dualbus/local/bin/bash

count=0 limit=3
trap 'kill "$pid"; exit' INT
job() {
  local amount=$((RANDOM % 8))
  echo "sleeping $amount seconds"
  sleep "$amount"
}
for ((i=0; i<limit; i++)); do
  ((i>0)) && wait -n; job & pid=$!
done
dualbus@debian:~$ chmod +x waitn.bash 
dualbus@debian:~$ ./waitn.bash 
sleeping 3 seconds
sleeping 0 seconds
sleeping 5 seconds

You could argue that there are other ways to do this in a more portable way, that is, without CHLD or wait -n:

dualbus@debian:~$ cat portable.sh 
#!/bin/sh

count=0 limit=3
trap 'counter && { brand; job & }; wait' USR1
unset RANDOM; rseed=123459876$$
brand() {
  [ "$rseed" -eq 0 ] && rseed=123459876
  h=$((rseed / 127773))
  l=$((rseed % 127773))
  rseed=$((16807 * l - 2836 * h))
  RANDOM=$((rseed & 32767))
}
job() {
  amount=$((RANDOM % 8))
  echo "sleeping $amount seconds"
  sleep "$amount"
  kill -USR1 "$$"
}
counter() {
  [ "$count" -lt "$limit" ]; ret=$?
  count=$((count+1))
  return "$ret"
}
counter && { brand; job & }
wait
dualbus@debian:~$ chmod +x portable.sh 
dualbus@debian:~$ ./portable.sh 
sleeping 2 seconds
sleeping 5 seconds
sleeping 6 seconds

So, in conclusion, set -m is not that useful in scripts, since the only interesting feature it brings to scripts is being able to work with SIGCHLD. And there are other ways to achieve the same thing either shorter (wait -n) or more portable (sending signals yourself).

dualbus
2

Bash does support job control, as you say. But in shell script writing, there is often an assumption that you can't rely on having bash, only the vanilla Bourne shell (sh), which historically did not have job control.

I'm hard-pressed these days to imagine a system in which you are honestly restricted to the real Bourne shell. Most systems' /bin/sh will be linked to bash. Still, it's possible. One thing you can do is, instead of specifying

#!/bin/sh

You can do:

#!/bin/bash

That, and your documentation, would make it clear your script needs bash.

Peter
0

Possibly off-topic, but I quite often use nohup when I ssh into a server to run a long-running job, so that if I get logged out the job still completes.

I wonder if people are confusing stopping and starting from a master interactive shell with spawning background processes? The wait command allows you to spawn a lot of things and then wait for them all to complete, and, like I said, I use nohup all the time. It's more complex than this and very underused -- sh supports this mode too. Have a look at the manual.
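
Something like this, for instance (long_job and job.log are placeholders):

nohup long_job > job.log 2>&1 &   # keeps running even if the ssh session dies
echo "started as PID $!"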

You've also got

kill -STOP pid

I quite often do that if I want to suspend the currently running sudo, as in:

kill -STOP $$

But woe betide you if you've jumped out to the shell from an editor - it will all just sit there.

I tend to use mnemonic -KILL etc. because there's a danger of typing

kill - 9 pid # note the space

and in the old days you could sometimes bring the machine down because it would kill init!

Ghoti
-1

jobs DO work in bash scripts

BUT, you NEED to watch out for where the background jobs get spawned, e.g.:

ls -1 /usr/share/doc/ | while read -r doc ; do ... done

jobs will have a different context on each side of the | (each side of a pipeline runs in its own subshell, so jobs started inside the while body are not visible to the parent script)

One way around this is to use for instead of while:

for doc in `ls -1 /usr/share/doc` ; do ... done

This should demonstrate how to use jobs in a script, with the mention that the oddity flagged in the comment below is real (the comment explains what seems to be going on):

#!/bin/bash

# spawn 7 background jobs (subshells that just sleep)
for i in $(seq 7); do ( sleep 100 ) & done

jobs

# keep killing jobs until the job table is empty
while [ "$(jobs | wc -l)" -ne 0 ]; do
    for jobnr in $(jobs | awk '{print $1}' | cut -d'[' -f2- | cut -d']' -f1); do
        kill "%$jobnr"
    done
    # This is the really odd part: without the next line the while loop never
    # exits. `jobs | wc -l` runs jobs in a subshell; terminated jobs only get
    # cleared from the job table when jobs runs in the parent shell itself.
    jobs >/dev/null 2>&1
done

sleep 1
jobs    # the job table should be empty now
THESorcerer
  • Jobs do work, but the question is not about jobs, this is about job **control** (`fg`, `bg`, etc) – xhienne Apr 21 '21 at 16:22