119

I currently have a script that does something like

./a | ./b | ./c

I want to modify it so that if any of a, b, or c exits with an error code, I print an error message and stop instead of piping bad output forward.

What would be the simplest/cleanest way to do so?

Josh Correia
hugomg
  • 9
    There really needs to be something like `&&|` which would mean "only continue the pipe if the preceding command was successful". I suppose you could also have `|||` which would mean "continue the pipe if the preceding command failed" (and possibly pipe the error message like Bash 4's `|&`). – Dennis Williamson Oct 14 '09 at 07:28
  • 9
    @DennisWilliamson, you can't "stop the pipe" because `a`, `b`, `c` commands are not run sequentially but in parallel. In other words, data flows sequentially from `a` to `c`, but the actual `a`, `b` and `c` commands start (roughly) at the same time. – Giacomo Jun 04 '12 at 11:09
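
A small experiment makes this parallel start-up visible; the echo/sleep commands below are illustrative stand-ins, not the asker's scripts. Both sides of the pipe announce themselves on stderr immediately, and the data only arrives two seconds later:

(echo "a started" >&2; sleep 2; echo data) | (echo "b started" >&2; cat)

Because "b started" prints at once, the right-hand command clearly does not wait to see whether the left-hand command succeeds.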

5 Answers

183

In bash you can use set -e and set -o pipefail at the beginning of your script. A subsequent command ./a | ./b | ./c will then fail when any of the three scripts fails, and the return code of the pipeline will be that of the rightmost script that failed.

Note that pipefail isn't available in standard sh.
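
A minimal sketch of what this might look like in practice, assuming ./a, ./b and ./c are the scripts from the question (the trap line is an optional, bash-only way to print a message before the shell aborts):

#!/bin/bash
set -e           # abort the script as soon as any command fails
set -o pipefail  # make a pipeline fail if any component fails, not just the last one
trap 'echo "the pipeline failed" >&2' ERR   # optional error message

./a | ./b | ./c
echo "all three succeeded"   # only reached if the whole pipeline succeeded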

Markus Unterwaditzer
Michel Samia
  • I didn't know about pipefail, really handy. – Phil Jackson Jun 19 '13 at 14:02
  • It is intended to be put into a script, not used in the interactive shell. Your behaviour can be simplified to `set -e; false`; it also exits the shell process, which is the expected behaviour ;) – Michel Samia Oct 10 '13 at 14:43
  • 17
    Note: this will still execute all three scripts, and not stop the pipe on first error. – hyde Oct 16 '13 at 06:58
  • @hyde, care to explain? I tried it, and it doesn't seem to execute all three scripts. It does seem to stop the pipe on first error. – Gui Prá Nov 01 '14 at 14:20
  • @n2liquid-GuilhermeVieira Play with different variations of this to understand: `(set -e ; set -o pipefail ; (echo part1 ; exit 1) | (echo sleep ; sleep 1 ; echo part2; read line ; echo $line) ; echo pipe result $? ) ` – hyde Nov 01 '14 at 18:33
  • Wow, so confusing. I'll have to read more about that later. Thanks, @hyde. – Gui Prá Nov 01 '14 at 20:09
  • 1
    @n2liquid-GuilhermeVieira Btw, by "different variations", I specifically meant, remove one or both of the `set` (for 4 different versions total), and see how that affects the output of the last `echo`. – hyde Nov 01 '14 at 20:33
  • 1
    @josch I found this page when searching for how to do this in bash from google, and thought, "This is exactly what I am looking for". I suspect many people who upvote answers go through a similar thought process and do not check the tags and the definitions of the tags. – Troy Daniels Jan 30 '17 at 19:17
59

You can also check the ${PIPESTATUS[@]} array after the full execution, e.g. if you run:

./a | ./b | ./c

Then PIPESTATUS will be an array of exit codes from each command in the pipe, so if the middle command failed, echo ${PIPESTATUS[@]} would print something like:

0 1 0

and something like this run after the command:

test ${PIPESTATUS[0]} -eq 0 -a ${PIPESTATUS[1]} -eq 0 -a ${PIPESTATUS[2]} -eq 0

will allow you to check that all commands in the pipe succeeded.
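
If the pipeline may grow, a loop over the array scales better than the three-way test. A sketch (bash only; note that PIPESTATUS is overwritten by the very next command, so copy it first):

./a | ./b | ./c
status=("${PIPESTATUS[@]}")   # copy immediately; any further command overwrites PIPESTATUS
for i in "${!status[@]}"; do
    if [ "${status[$i]}" -ne 0 ]; then
        echo "command $((i + 1)) in the pipe failed with ${status[$i]}" >&2
        exit 1
    fi
done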

sanmai
Imron
  • 13
    This is bashish: it's a bash extension and not part of the POSIX standard, so other shells like dash and ash won't support it. This means you can run into trouble if you try to use it in scripts which start `#!/bin/sh`, because if `sh` isn't bash, it won't work. (Easily fixable by remembering to use `#!/bin/bash` instead.) – David Given Sep 19 '14 at 23:24
  • 3
    `echo ${PIPESTATUS[@]} | grep -qE '^[0 ]+$'` - returns 0 if `${PIPESTATUS[@]}` only contains 0 and spaces (if all commands in the pipe succeed). – MattBianco Oct 14 '16 at 07:28
  • @MattBianco This is the best solution for me. It also works with && e.g. `command1 && command2 | command3` If any of those fail your solution returns non-zero. – DavidC Sep 11 '19 at 10:42
24

If you really don't want the second command to proceed until the first is known to be successful, then you probably need to use temporary files. The simple version of that is:

tmp=${TMPDIR:-/tmp}/mine.$$
if ./a > $tmp.1
then
    if ./b <$tmp.1 >$tmp.2
    then
        if ./c <$tmp.2
        then : OK
        else echo "./c failed" 1>&2
        fi
    else echo "./b failed" 1>&2
    fi
else echo "./a failed" 1>&2
fi
rm -f $tmp.[12]

The '1>&2' redirection can also be abbreviated '>&2'; however, an old version of the MKS shell mishandled the error redirection without the preceding '1' so I've used that unambiguous notation for reliability for ages.

This leaks files if you interrupt something. Bomb-proof (more or less) shell programming uses:

tmp=${TMPDIR:-/tmp}/mine.$$
trap 'rm -f $tmp.[12]; exit 1' 0 1 2 3 13 15
...if statement as before...
rm -f $tmp.[12]
trap 0 1 2 3 13 15

The first trap line says to run the commands `rm -f $tmp.[12]; exit 1` when any of the signals 1 SIGHUP, 2 SIGINT, 3 SIGQUIT, 13 SIGPIPE or 15 SIGTERM occurs, or on 0 (when the shell exits for any reason). If you're writing a shell script, the final trap only needs to remove the trap on 0, the shell exit trap (you can leave the other signals in place since the process is about to terminate anyway).
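
As the comments below suggest, mktemp (where available; POSIX standardizes neither mktemp nor tmpfile) avoids the predictable $$-based names. A sketch of the same cleanup pattern using it:

tmp1=$(mktemp) || exit 1
tmp2=$(mktemp) || { rm -f "$tmp1"; exit 1; }
trap 'rm -f "$tmp1" "$tmp2"; exit 1' 0 1 2 3 13 15
...if statement as before, with $tmp.1 and $tmp.2 replaced by $tmp1 and $tmp2...
rm -f "$tmp1" "$tmp2"
trap 0 1 2 3 13 15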

In the original pipeline, it is feasible for 'c' to be reading data from 'b' before 'a' has finished - this is usually desirable (it gives multiple cores work to do, for example). If 'b' is a 'sort' phase, then this won't apply - 'b' has to see all its input before it can generate any of its output.

If you want to detect which command(s) fail, you can use:

(./a || echo "./a exited with $?" 1>&2) |
(./b || echo "./b exited with $?" 1>&2) |
(./c || echo "./c exited with $?" 1>&2)

This is simple and symmetric - it is trivial to extend to a 4-part or N-part pipeline.

Simple experimentation with 'set -e' didn't help.

Jonathan Leffler
  • 2
    I would recommend using `mktemp` or `tempfile`. – Dennis Williamson Oct 11 '09 at 19:40
  • @Dennis: yes, I suppose I should get used to commands such as mktemp or tmpfile; they didn't exist at the shell level when I learned it, oh so many years ago. Let's do a quick check. I find mktemp on MacOS X; I have mktemp on Solaris but only because I've installed GNU tools; it seems that mktemp is present on antique HP-UX. I'm not sure whether there's a common invocation of mktemp that works across platforms. POSIX standardizes neither mktemp nor tmpfile. I didn't find tmpfile on the platforms I have access to. Consequently, I won't be able to use the commands in portable shell scripts. – Jonathan Leffler Oct 11 '09 at 21:06
  • 1
    When using `trap` keep in mind that the user can always send `SIGKILL` to a process to immediately terminate it, in which case your trap will not take effect. Same goes for when your system experiences a power outage. When creating temporary files, make sure to use `mktemp` because that will put your files somewhere where they are cleaned up after a reboot (usually into `/tmp`). – josch Jun 20 '16 at 08:15
10

Unfortunately, the answer by Jonathan requires temporary files and the answers by Michel and Imron require bash (even though this question is tagged shell). As pointed out by others already, it is not possible to abort the pipe before the later processes are started: all processes are started at once and will thus all run before any errors can be communicated. But the title of the question also asks about error codes, and these can be retrieved and investigated after the pipe has finished to figure out whether any of the involved processes failed.

Here is a solution that catches all errors in the pipe and not only errors of the last component. So this is like bash's pipefail, just more powerful in the sense that you can retrieve all the error codes.

res=$( { (./a 2>&1 || echo "1st failed with $?" >&2) |
(./b 2>&1 || echo "2nd failed with $?" >&2) |
(./c 2>&1 || echo "3rd failed with $?" >&2) > /dev/null; } 2>&1 )
if [ -n "$res" ]; then
    echo pipe failed
fi

To detect whether anything failed, an echo command prints to standard error whenever one of the commands fails. The combined standard error output of the pipeline is then captured in $res and investigated afterwards. This is also why the standard error of each process is redirected to its standard output: it travels down the pipe with the data instead of ending up in $res. The data itself is discarded at the end; you can replace the final redirect to /dev/null with a file if you need to store the output of the last command somewhere.
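
For instance, to keep the pipeline's real output in a file while still collecting the failure markers in $res (result.txt is a hypothetical filename):

res=$( { (./a 2>&1 || echo "1st failed with $?" >&2) |
(./b 2>&1 || echo "2nd failed with $?" >&2) |
(./c 2>&1 || echo "3rd failed with $?" >&2) > result.txt; } 2>&1 )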

To play more with this construct and to convince yourself that this really does what it should, I replaced ./a, ./b and ./c by subshells which execute echo, cat and exit. You can use this to check that this construct really forwards all the output from one process to another and that the error codes get recorded correctly.

res=$( { (sh -c "echo 1st out; exit 0" 2>&1 || echo "1st failed with $?" >&2) |
(sh -c "cat; echo 2nd out; exit 0" 2>&1 || echo "2nd failed with $?" >&2) |
(sh -c "echo start; cat; echo end; exit 0" 2>&1 || echo "3rd failed with $?" >&2) > /dev/null; } 2>&1 )
if [ -n "$res" ]; then
    echo pipe failed
fi
josch
  • You raise an interesting point about when a pipeline list gets started. Is there a way for an intermediate list to check on the completion of a prior (or even a successor) ? And further, to take action if the prior completes or fails. Seems hard to do well, if at all. e.g. I frequently will add `head` at the end of the pipe, in partial hopes of forcing the prior to terminate after doing 'just enough' work. The idea is to slightly leverage the pipeline characteristics, not to provide an alternative approach to – ShpielMeister Aug 13 '22 at 21:05
  • @ShpielMeister The processes in the pipeline are connected via their stdin and stdout. That's how they can communicate, and unless you give them more information in the form of command line arguments, that's the only way they can communicate with each other. If your prior command fails to terminate after doing 'just enough work', then you have to fix that program so that it stops producing output once its output buffer is full because your `head` stopped reading. For example python does the right thing automatically: `python3 -c '[print(20*"x") for x in range(2000)]' | head` – josch Aug 16 '22 at 02:43
  • `$ python3 -c '[print(20*"x") for x in range(300)]' | head -1` xxxxxxxxxxxxxxxxxxxx `$ python3 -c '[print(20*"x") for x in range(3000)]' | head -1` xxxxxxxxxxxxxxxxxxxx Traceback (most recent call last): File "", line 1, in File "", line 1, in BrokenPipeError: [Errno 32] Broken pipe Exception ignored in: <_io.TextIOWrapper name='' mode='w' encoding='utf-8'> BrokenPipeError: [Errno 32] Broken pipe` .... Could signals be used? Seems messy. `$ yes | head -2` y y `$ yes | head -200000000 | tail -1 # take a long time` `^C` – ShpielMeister Aug 16 '22 at 07:14
1

This answer is in the spirit of the accepted answer, but using shell variables instead of temporary files.

# Note: each step's entire output is held in a shell variable, so this suits
# text output of modest size (command substitution also strips trailing newlines).
if TMP_A="$(./a)"
then
    if TMP_B="$(echo "$TMP_A" | ./b)"
    then
        if TMP_C="$(echo "$TMP_B" | ./c)"
        then
            echo "$TMP_C"
        else
            echo "./c failed"
        fi
    else
        echo "./b failed"
    fi
else
    echo "./a failed"
fi
Jasha