TL;DR
I'd like to understand why the yes command works properly with most tools and scripts that read from standard input, but fails to work with Bash's own read builtin except when using process substitution or a particular combination of shell options. I find this behavior surprising and poorly documented, although I think it's related to the way that Bash pipelines typically create subshells.
Bash's Read Builtin
I'm using Bash 5.1.16(1)-release on macOS 11.6.3. The yes command is therefore from BSD, but I'm seeing the same behavior on various Linux systems. Specifically, the output of yes can be successfully piped into shell scripts and tools that read from standard input, but for some reason I can't get it to populate a variable using the Bash read builtin. Since yes writes to standard output, and read defaults to standard input, I'd expect the following to populate the builtin's default REPLY variable:
yes | read
echo "$REPLY"
However, REPLY isn't even set:
$ declare -p REPLY
bash: declare: REPLY: not found
The delimiter doesn't seem to be the problem either. If read were failing for lack of a newline, either of the following character-oriented options should work:
$ yes | read -n 1; declare -p REPLY
$ yes | read -N 1; declare -p REPLY
but again, in both cases Bash reports bash: declare: REPLY: not found.
Please note that the problem is the same even if I explicitly define a variable to populate. It isn't an issue with read's default REPLY variable; it seems to be an issue with the way that the builtin expects to get input.
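As a minimal sketch of that claim, the snippet below uses a named variable instead of REPLY. The name answer is arbitrary and chosen only for illustration, and the ${answer-<unset>} expansion simply substitutes a marker when the variable is unset.

```shell
# "answer" is a hypothetical variable name used only for this illustration.
unset answer
yes | read answer

# ${answer-<unset>} expands to the marker text if "answer" is unset.
result_pipe="${answer-<unset>}"
echo "after the pipeline: $result_pipe"
```

In Bash with default options, I'd expect this to print the <unset> marker rather than y, which matches what I see with REPLY.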
Process Substitution, Some Compound Commands, and Non-Builtins Work Fine
On the other hand, Bash's process substitution works just fine:
$ read < <(yes)
$ echo "$REPLY"
y
Why would it work with process substitution, but not with a simple pipe? It also sort of works if I access REPLY from within the same compound command (a { ...; } group) that runs read. For example, after first unsetting REPLY:
$ unset REPLY
$ yes | { read; echo "$REPLY"; }
y
$ declare -p REPLY
bash: declare: REPLY: not found
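The same effect can be sketched without yes at all, which makes me suspect the pipeline rather than the input source. Here printf stands in for yes, and line is a hypothetical variable name that starts with a known value so any change is visible.

```shell
# "line" is a hypothetical variable name; printf stands in for yes.
line="outer"

# The group on the right-hand side of the pipe sees the value that read assigned.
printf 'inner\n' | { read line; echo "inside the group: $line"; }

# Back in the parent shell, the earlier value is still in place.
echo "after the pipeline: $line"
```

With Bash's default options I'd expect the first echo to show inner and the second to show outer, so the assignment appears to be confined to whatever process runs the group.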
Obviously, it also works as expected with other tools that take standard input. For example, using Perl or Ruby:
$ yes | perl -ne 'print; exit'
y
$ yes | ruby -nle 'pp $_; exit'
"y"
Partial Answer from Related Question
Finally, based on a comment buried within a related question, it looks like you can make a standard(ish) pipeline work if you:
- disable job control with the set builtin, and
- enable the shell's lastpipe option with shopt.
For example:
$ shopt -s lastpipe; \
set +m; \
unset REPLY; \
yes | read; \
echo "$REPLY"
y
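A related sketch, assuming a non-interactive shell: job control is off by default there, so enabling lastpipe alone should be enough. The snippet runs the pipeline in a fresh bash -c and captures what it prints; answer and result_lastpipe are hypothetical names used only for illustration.

```shell
# Run in a fresh non-interactive Bash, where job control is off by default,
# so only lastpipe needs to be enabled. "answer" is a hypothetical name.
result_lastpipe=$(bash -c 'shopt -s lastpipe; yes | read answer; echo "$answer"')

echo "with lastpipe: $result_lastpipe"
```

This should print y, which is consistent with set +m only being needed in an interactive session.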
At least this defines the problem as a subshell-related issue rather than an issue with standard input, but it doesn't really explain why the limitation exists or what exactly job control has to do with this. If this is expected and foundational behavior for Bash, it's not really intuitive, and I'd appreciate a better explanation (if one exists) for the semantics of this.