What's the difference between my_command < file.txt and cat file.txt | my_command?
my_command < file.txt
The redirection symbol can also be written as 0<, as this redirects file descriptor 0 (stdin) to connect to file.txt instead of the current setting, which is probably the terminal. If my_command is a shell built-in then NO child processes are created; otherwise there is one.
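As a quick check of both points, the sketch below reads the first line of a throwaway file with the read built-in, once with < and once with 0<. The temporary file and the variable names are made up for illustration. Because read runs in the current shell here, the variables it sets are still visible afterwards.

```shell
# Throwaway input file (hypothetical; any text file would do).
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# Both spellings connect file descriptor 0 to the file.
# read is a built-in, so no child process is created and the
# variables remain set in this shell.
read -r first < "$tmpfile"
read -r first_fd0 0< "$tmpfile"

echo "first=$first first_fd0=$first_fd0"
rm -f "$tmpfile"
```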
cat file.txt | my_command
This redirects file descriptor 1 (stdout) of the command on the left to the write end of an anonymous pipe, and file descriptor 0 (stdin) of the command on the right to the read end of the same pipe.
We see at once that there is a child process, since cat is not a shell built-in. However, in bash, even if my_command is a shell built-in it is still run in a child process. Therefore we have TWO child processes.
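One way to observe this is to pipe into the read built-in: the variable is set in the child process and is gone once the pipeline finishes. This is a minimal sketch assuming bash with its default options; the variable name is hypothetical.

```shell
piped=""

# read is a built-in, but as part of a pipeline it runs in a child
# process, so its assignment never reaches this shell.
printf 'hello\n' | read -r piped

# With bash defaults the variable is still empty here.
echo "after pipe: [$piped]"
```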
So the pipe, in theory, is less efficient. Whether that difference is significant depends on many factors, including the definition of "significant". A pipe is preferable when the alternative is an intermediate file:
command1 > file.txt
command2 < file.txt
Here it is likely that
command1 | command2
is more efficient, remembering that, in practice, the intermediate-file version will probably need a third child process to run rm file.txt.
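The two variants can be compared side by side. In this sketch seq and sort -rn are only stand-ins for command1 and command2, and the temporary file name comes from mktemp; none of these are from the original question.

```shell
# seq and sort -rn are stand-ins for command1 and command2.

# Variant 1: intermediate file, which then has to be removed.
tmpfile=$(mktemp)
seq 1 3 > "$tmpfile"
via_file=$(sort -rn < "$tmpfile")
rm -f "$tmpfile"

# Variant 2: the same data flows through an anonymous pipe instead.
via_pipe=$(seq 1 3 | sort -rn)

[ "$via_file" = "$via_pipe" ] && echo "same output"
```

The command substitutions are there only to capture the output for comparison; they are not part of the pattern being illustrated.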
However, there are limitations to pipes. They are not seekable (random access, see man 2 lseek) and they cannot be memory mapped (see man 2 mmap). Some applications map files to virtual memory, but it would be unusual to do that to stdin or stdout. Memory mapping in particular is not possible on a pipe (whether anonymous or named) because a range of virtual addresses has to be reserved and for that a size is required.
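A script that cares about these limitations can check what stdin is actually connected to. The following is a rough sketch: describe_stdin is a hypothetical helper, and it assumes /dev/stdin can be tested (bash handles that name itself in test expressions; other shells rely on the operating system providing it).

```shell
# Hypothetical helper: report what kind of file stdin refers to.
describe_stdin() {
    if [ -p /dev/stdin ]; then
        echo "pipe"
    elif [ -f /dev/stdin ]; then
        echo "regular file"
    else
        echo "other"
    fi
}

tmpfile=$(mktemp)
echo "data" > "$tmpfile"

# Redirection: stdin is the file itself, so it can be seeked or mapped.
from_file=$(describe_stdin < "$tmpfile")

# Pipeline: stdin is the read end of a pipe, so it cannot.
from_pipe=$(cat "$tmpfile" | describe_stdin)

echo "redirected: $from_file"
echo "piped: $from_pipe"
rm -f "$tmpfile"
```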
Edit:
As mentioned by @JohnKugelman, a common error and source of many SO questions is the associated issue with a child process and redirection:
Take a file file.txt with 99 lines:
i=0
cat file.txt|while read
do
(( i = i+1 ))
done
echo "$i"
What gets displayed? The answer is 0. Why? Because the count i = i + 1 is done in a subshell which, in bash, is a child process and does not change i in the parent (note: this does not apply to the Korn shell, ksh, where the last command of a pipeline runs in the current shell).
i=0
while read
do
(( i = i+1 ))
done < file.txt
echo "$i"
This displays the correct count, 99, because the loop runs in the current shell and no child processes are involved.
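When the lines come from a command rather than a file, the same redirection form can still be used in bash by reading from a process substitution. This is a sketch under assumptions: it needs bash (process substitution is not POSIX sh), and seq 1 99 is only a stand-in for whatever command produces the 99 lines.

```shell
i=0

# The loop runs in the current shell; only the producer (seq) runs
# in a child process, so the updates to i are kept.
while read -r _line
do
    i=$((i + 1))
done < <(seq 1 99)

echo "$i"
```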