In an answer to a question about piping and redirection, robert mentions that piping also captures the stdout of substituted processes in the pipeline, whilst redirection doesn't. Why is this so? What exactly is going on, that results in this behavior:

bash-4.1$ echo -e '1\n2' | tee >(head -n1) >redirect
1
bash-4.1$ cat redirect
1
2
bash-4.1$ echo -e '1\n2' | tee >(head -n1) | cat >pipe
bash-4.1$ cat pipe
1
2
1

I would've thought that both forms would produce the same result -- the latter one.

Reading an answer to a different question, it seemed plausible that reordering the redirect in the command might produce the desired result, but no matter the order, the result is always the same:

bash-4.1$ echo -e '1\n2' | tee >redirect >(head -n1)
1
bash-4.1$ cat redirect
1
2
bash-4.1$ echo -e '1\n2' | >redirect tee >(head -n1)
1
bash-4.1$ cat redirect
1
2

Why does the stdout redirect affect only tee, while the pipe captures the output of the substituted process head as well? Is it simply "by design"?


Just a thought related to the above question: I thought that redirecting to a file and piping the output in the same command would never make sense, but it does make sense with process substitution:

bash-4.1$ echo -e '1\n2\n3' | tee >(head -n1) >(tail -n1) >tee_out | cat >subst_out
bash-4.1$ cat tee_out
1
2
3
bash-4.1$ cat subst_out
1
3
Irfy
  • As an aside, `echo -e` is POSIX-specified to emit `-e` on output (and if bash has both `posix` and `xpg_echo` flags enabled, it will conform to the specification on this matter; this is one of the few places where its out-of-the-box behavior is not an *extension* of the standard but an outright *violation*, as said standard says in black and white that "implementations shall not support any options" -- sole exception being that `-n` makes behavior unspecified). Use `printf '%s\n' 1 2` instead to have something that works reliably on all POSIX-family shells. – Charles Duffy Feb 12 '18 at 15:43
  • (Also see the APPLICATION USAGE section of [the POSIX spec for `echo`](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/echo.html), which explains the historical reasons behind `echo`'s unusually loose specification, and explicitly suggests `printf` instead if either escape sequences or the `-n` argument are desired functionality). – Charles Duffy Feb 12 '18 at 15:45

3 Answers

The shell that runs head is spawned by the same shell that runs tee, so tee and head both inherit the same file descriptor for standard output, and that file descriptor is connected to the pipe to cat. Both tee and head therefore have their output piped to cat, resulting in the behavior you see.
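
A minimal sketch of this inheritance, using `printf` rather than `echo -e`. It assumes `bash` is on the PATH (process substitution is not available in plain `sh`, so bash is called explicitly); the scratch directory and the variable names are made up for illustration. `sort` stands in for `cat` only so that the order of the lines does not depend on timing.

```shell
# Process substitution is a bash feature, so bash is invoked explicitly.
tmp=$(mktemp -d)   # hypothetical scratch directory

# Redirection: only tee's stdout goes to the file. head keeps the stdout of
# the shell that spawned it, so its line escapes to the caller.
escaped=$(bash -c 'printf "%s\n" 1 2 | tee >(head -n1) >"$1"' _ "$tmp/redirect")
redirected=$(cat "$tmp/redirect")

# Pipe: tee and head share the pipe as stdout, so sort sees three lines.
piped=$(bash -c 'printf "%s\n" 1 2 | tee >(head -n1) | sort')

printf 'escaped:    %s\n' "$escaped"                                  # 1
printf 'redirected: %s\n' "$(printf '%s' "$redirected" | tr '\n' ' ')" # 1 2
printf 'piped:      %s\n' "$(printf '%s' "$piped" | tr '\n' ' ')"      # 1 1 2
rm -rf "$tmp"
```

Capturing with `$(...)` waits until every writer, including head, has closed the captured stdout, which is why the results do not depend on when head finishes.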

chepner

For

echo -e '1\n2' | tee >(head -n1) > redirect

the redirection applies only to tee, the command after the |, so head still writes to the tty. To redirect the stdout of both tee and head, you can write

echo -e '1\n2' | { tee >(head -n1); } > redirect

or

{ echo -e '1\n2' | tee >(head -n1); } > redirect

For

echo -e '1\n2' | tee >(head -n1) | cat > pipe

the stdout of tee >(head -n1) as a whole is piped to cat. It is logically the same as echo -e '1\n2' | { tee >(head -n1); } > redirect.

pynexj

TL;DR: When executing part of a pipeline, the shell sets up the pipe redirection of stdin/stdout first and applies >/< redirections last. Process substitution happens between those two steps, so the pipe redirection of stdin/stdout is inherited by the substituted process, whilst the >/< redirection is not. It's a design decision.


To be fair, I accepted chepner's answer because he was first and he was correct. However, I decided to add my own answer to document my process of understanding this issue by reading bash's sources, as chepner's answer doesn't explain why the >/< redirection isn't inherited.


It helps to understand the steps involved (grossly simplified) when the shell encounters a complex pipeline. I have simplified my original problem to this example:

$ echo x >(echo y) >file
y
$ cat file
x /dev/fd/63

$ echo x >(echo y) | cat >file
$ cat file
x /dev/fd/63
y

Redirection-only

When the shell encounters echo x >(echo y) >file, it first forks once to execute the complex command (this can be avoided for some cases, like builtins), and then the forked shell:

  1. creates a pipe (for process substitution); pipe[0] is its read end and pipe[1] its write end
  2. forks for second echo
    1. fork: connects its stdin to pipe[0]
    2. fork: exec's echo y; the exec'ed echo inherits:
      • stdin connected to pipe[0]
      • unchanged stdout
  3. opens file
  4. connects its stdout to file
  5. exec's echo x /dev/fd/<pipe fd>, where the path refers to pipe[1]; the exec'ed echo inherits:
    • stdin unchanged
    • stdout connected to file

Here, the second echo inherits the stdout of the forked shell, before that forked shell redirects its stdout to file. I see no absolute necessity for this order of actions in this context, but I assume it makes more sense this way.
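
The order of these steps can be imitated by hand in plain POSIX sh. This is only a sketch: a background job stands in for the substituted process, the pipe that feeds it is left out, and the scratch directory and the file names `outer` and `file` are made up, with `outer` playing the role of the terminal.

```shell
tmp=$(mktemp -d)   # hypothetical scratch directory

# Imitates the step order for: echo x >(echo y) >file
(
  echo y &             # steps 1-2: the helper is forked first and inherits the current stdout
  exec >"$tmp/file"    # steps 3-4: only afterwards is stdout pointed at the file
  echo x               # step 5: the main command writes to the file
  wait                 # let the background helper finish
) >"$tmp/outer"        # stands in for the terminal

outer=$(cat "$tmp/outer")
file=$(cat "$tmp/file")
printf 'outer: %s\n' "$outer"   # outer: y
printf 'file:  %s\n' "$file"    # file:  x
rm -rf "$tmp"
```

The helper was forked before `exec` changed the subshell's stdout, so it keeps writing to the original destination, just as the substituted echo does above.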

Pipe-Redirect

When the shell encounters echo x >(echo y) | cat >file, it detects a pipeline and starts processing it (without forking):

  1. parent: creates a pipe (corresponding to the only actual | in the full command); pipe[0] is its read end and pipe[1] its write end
  2. parent: forks for left side of pipe
    1. fork1: connects its stdout to pipe[1]
    2. fork1: creates a pipe_subst (for process substitution)
    3. fork1: forks for second echo
      1. nested-fork: connects its stdin to pipe_subst[0]
      2. nested-fork: exec's echo y; the exec'ed echo inherits:
        • stdin connected to pipe_subst[0] from the inner fork
        • stdout connected to pipe[1] from the outer fork
    4. fork1: exec's echo x /dev/fd/<pipe_subst fd>, where the path refers to pipe_subst[1]; the exec'ed echo inherits:
      • stdin unchanged
      • stdout connected to pipe[1]
  3. parent: forks for right side of pipe (this fork, again, can sometimes be avoided)
    1. fork2: connects its stdin to pipe[0]
    2. fork2: opens file
    3. fork2: connects its stdout to file
    4. fork2: exec's cat; the exec'ed cat inherits:
      • stdin connected to pipe[0]
      • stdout connected to file

Here, the pipe takes precedence, i.e. the redirection of stdin/stdout due to the pipe is performed before any other action in executing the pipeline elements. Thus both echo processes inherit the stdout that is connected to cat.


All of this is really a design-consequence of >file redirection being handled after process substitution. If >file redirection were handled before that (like pipe redirection is), then >file would also have been inherited by the substituted processes.
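
The hypothetical order, with the file redirection applied before the fork, can be tried out by hand in plain POSIX sh. Again this is only a sketch: a background job stands in for the substituted process, and the scratch directory and file names are made up, with `outer` playing the role of the terminal.

```shell
tmp=$(mktemp -d)   # hypothetical scratch directory

# Hypothetical order: redirect first, fork the helper afterwards.
(
  exec >"$tmp/file"    # stdout is pointed at the file before anything is forked
  echo y &             # the helper now inherits the redirected stdout
  echo x
  wait                 # let the background helper finish
) >"$tmp/outer"        # stands in for the terminal

# sort makes the result independent of which echo wrote first
file=$(sort "$tmp/file")
outer=$(cat "$tmp/outer")
printf 'file:  %s\n' "$(printf '%s' "$file" | tr '\n' ' ')"   # file:  x y
printf 'outer: [%s]\n' "$outer"                               # outer: []
rm -rf "$tmp"
```

With this order nothing reaches `outer`: both lines end up in the file, which is how the pipe case behaves.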

Irfy
  • *"All of this is really a design-consequence of >file redirection being handled after process substitution. [...]"* -- this i do agree with, though i did not follow your full analysis. `tee >(head -n1) >file` as a whole is run in a subshell. even if `>file` is executed first, it only affects `tee` and the subshell's stdout is not impacted (you can verify this with `sleep 10 >& file` on one tty and go to another tty and run `ls /proc//fd/`), then `>(head)` would inherit the subshell's stdout, which still outputs to the current tty. – pynexj Feb 13 '18 at 15:52
  • How could applying `>file` redirection in the shell running `tee >(head -n1) >file` before forking for executing `head -n1` not affect that forked subshell's stdout? The redirected stdout would be inherited by the fork and exec'd `head -n1`. Just like redirection due to `|` is inherited because it happens before subshell forks. *I wrote a response first, will verify later ;)* – Irfy Feb 14 '18 at 11:12
  • I am talking about temporal *before* and *after*, not about positional, just to clarify. The shell must fork for `head -n1` before redirecting its output to `file`. The position of `>file` on the command-line is irrelevant for this. Was this the misunderstanding? – Irfy Feb 14 '18 at 11:22
  • no, i quite understand the position of `>file` makes no difference. i mean, for `cmd > file`, the redir may happen after `fork()` but before `exec()`? – pynexj Feb 14 '18 at 11:42
  • The `>file` redir should always happen after `fork`, so as not to have the `>file` redirect in the fork (subshell). Exactly when the `exec` happens relative to `>file` is then irrelevant, since it is asynchronous (separate processes, parent vs child). What I'm saying is that the shell has to be doing it this way; otherwise, if `>file` were applied before forking the subshell that handles process substitution, that subshell would have inherited the `>file` stdout redirection. – Irfy Feb 14 '18 at 14:10