4

I'm dealing with a script that first invokes a noisy program (lots of diagnostics on both stdout and stderr) and then processes its output with other tools.

The program's verbosity makes it impossible to simply send its stdout into a pipeline, so we currently use a temporary file -- a practice I'd like to end.

Instead of /tmp/foo, we can ask the program to write the data to /dev/fd/N -- and it will, no problem (it does not need to seek the file, for example).

Whatever noise it currently sends to stdout and stderr can continue going there -- the operators are used to seeing it and will be alarmed if it disappears.

But how do I arrange for the descriptor N to exist and be sent into the next program's stdin?

noisy -o /dev/fd/N ?????| filter -i /dev/stdin

If this requires bash, so be it, but I'd prefer a solution suitable for the entire sh-family, of course.

Mikhail T.
  • I think you should keep using a temporary file. – oguz ismail Feb 06 '21 at 19:07
  • Temporary files are bad. They are less efficient, and they litter the filesystems -- the code required to reliably clean them up is uglier than any answer to my question will be. They may be used by someone _in a hurry_ -- too busy to do things _right_. But to encourage their use, as you do, is even more wrong than using them. – Mikhail T. Feb 06 '21 at 19:39
  • Well, good luck then – oguz ismail Feb 06 '21 at 19:45
  • Could you use a named pipe, as in [Example of using named pipes in Linux Bash](https://stackoverflow.com/questions/4113986/example-of-using-named-pipes-in-linux-bash)? – Shane Bishop Feb 06 '21 at 20:17
  • A more fruitful approach might be to write a wrapper which suppresses the noise and only keeps the useful output. If the noise is predictable, it could be as simple as a single `grep`. – tripleee Feb 06 '21 at 20:29
  • @ShaneBishop, though named pipes don't have the efficiency problem of temporary files, they still litter the filesystems. I've accepted the answer by pjh with gratitude -- everyone should understand it to make their shell scripts better. – Mikhail T. Feb 08 '21 at 17:12
  • Temporary files are sometimes the best option, and sometimes unavoidable. See [Removing created temp files in unexpected bash exit](https://stackoverflow.com/q/687014) for clean, reliable, and safe ways to create them and ensure that they are cleaned up (`mktemp`, `trap ... EXIT`). Programs that use temporary files can also be easier to debug because intermediate results can be examined. – pjh Feb 09 '21 at 09:50
  • @pjh, `trap` will not help if an impatient operator kills your script with `kill -9`. Nor is `trap` any better-looking than the mechanism you provided in your answer, and the inefficiency is still there: the kernel does not know your file is temporary, so it still has to sync the data to the filesystem. The only valid observation is ease of debugging, but for that one can simply insert `tee /tmp/temp` in front of the `| filter`. _Temporarily_ -- to be removed when the debugging is over. – Mikhail T. Feb 09 '21 at 16:31

2 Answers

7

If I understand your problem correctly, you've got a program that writes noise to standard output and standard error, and writes useful data to a file specified with a -o option. You want standard output and standard error to be left as they are, but you want the useful data piped into a filter program instead of being written to a file.

The easiest way to do that with Bash is to use process substitution (see ProcessSubstitution - Greg's Wiki):

noisy -o >(filter -i /dev/stdin)

Note that process substitution is not available in some sh-family shells or with Bash on some (uncommon) platforms, and that before Bash 4.4 there is no way to get the exit status of a process created with process substitution.
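
To make the process-substitution form concrete, here is a minimal sketch rather than a definitive recipe. It assumes Bash 4.4 or newer is installed as `bash` and that the platform provides `/dev/fd`. The `noisy` and `filter` functions are hypothetical stand-ins for the real programs, and `procsub_demo` is just a name chosen for this illustration. The script is handed to `bash -c` explicitly so that it also behaves when the calling script runs under a plain `sh`.

```shell
# Process substitution is not POSIX sh, so the snippet is passed to bash explicitly.
procsub_demo() {
    bash -c '
        # Hypothetical stand-ins for the real programs (assumptions, not the real tools).
        noisy() {
            echo "noise on stdout"
            echo "noise on stderr" >&2
            echo "useful data" > "$2"    # "$2" is the path given after -o
        }
        filter() { sed "s/^/filtered: /"; }

        # bash replaces >(filter) with a /dev/fd/N path connected to the stdin of filter.
        noisy -o >(filter)

        # Bash 4.4+ only: wait for the substituted process and collect its exit status.
        wait "$!"
    '
}

procsub_demo
```

The noise still reaches the original stdout and stderr, and only the data written to the `-o` path passes through the filter. On a Bash older than 4.4 the `wait "$!"` line is expected to fail, which is the exit-status limitation mentioned above.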

Another possible way to do what (I think) you want is:

exec 3>&1
{ exec 4>&1; noisy -o /dev/fd/4 >&3 ; } | filter -i /dev/stdin
  • `exec 3>&1` makes file descriptor 3 refer to the "real" standard output.
  • `exec 4>&1` (since it is run in a process that is the first stage of a pipeline) makes file descriptor 4 refer to the input to the next stage in the pipeline.
  • `noisy ... >&3` forces the standard output of noisy to go to the "real" standard output.
  • Writing to `/dev/fd/4` (on Linux at least) writes to the next stage in the pipeline.
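
The descriptor juggling described above can be tried out without the real programs. The following is a sketch under stated assumptions: a platform with `/dev/fd` (Linux, for instance), and hypothetical `noisy` and `filter` shell functions standing in for the real tools. `run_pipeline` is only a wrapper name chosen for this illustration.

```shell
#!/bin/sh
# Hypothetical stand-in: diagnostics on stdout and stderr, data to the -o path.
noisy() {
    echo "noise on stdout"
    echo "noise on stderr" >&2
    echo "useful data" > "$2"    # "$2" is the path given after -o
}

# Hypothetical stand-in for the downstream tool; it reads stdin.
filter() {
    sed 's/^/filtered: /'
}

run_pipeline() {
    # fd 3: a copy of the "real" standard output.
    exec 3>&1
    # In the first pipeline stage fd 1 is the pipe, so fd 4 becomes a copy
    # of the pipe, and >&3 sends the stdout noise back to the real stdout.
    { exec 4>&1; noisy -o /dev/fd/4 >&3; } | filter
    # Close the saved copy once the pipeline is done.
    exec 3>&-
}

run_pipeline
```

Only the line written to `/dev/fd/4` passes through `filter`. Both kinds of noise arrive where they did before. Note that the pipeline's exit status is that of `filter`, not of `noisy`.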

I've only tested it with Bash, but I think it should work with other sh-family shells.

Mikhail T.
pjh
  • Thanks for the solution! Honestly, I don't understand, how to make my question any clearer -- but I'm glad, you got it :) – Mikhail T. Feb 08 '21 at 17:05
0

Here is one possible way:

( \
  /my/noisy/script \
    2> >(tee /tmp/stderr.log) \
    1> >(tee /tmp/stdout.log) \
) 2>&1 | tee /tmp/both.log

Replace the tee calls with whatever filter you like.

KJ7LNW