14

I want to call a function with a pipe, and read all of stdin into a variable.

I read that the correct way to do that is with `read`, or maybe `read -r` or `read -a`. However, I had a lot of problems doing that in practice (especially with multi-line strings).

In the end I settled on

function example () {
  local input=$(cat)
  ...
}

What is the idiomatic way to do this?

Paul Biggar
  • Do you need all of the input in a single string or do you need to operate on the pipe data in pieces? – Etan Reisner Sep 02 '15 at 22:07
  • 2
    have you tried something like `read -r -d\0 input` (assuming your input does not contain any null characters)? This will save one process per function call. I believe however that the `$(cat)` solution is more readable. – Michael Jaros Sep 02 '15 at 22:14
  • 2
    I think the idiomatic way to do this is to not do it. Instead, read stdin instead of using the variable where you intend to use the variable. If you need to reference the data multiple times, figure out a way to refactor the work so that is not necessary. How do you intend to use the variable? – William Pursell Sep 02 '15 at 22:34
  • @williamPursell I want to save this all up front because I want to do other things before processing it (such as calling other functions, declaring other vars, etc). – Paul Biggar Sep 02 '15 at 22:57
  • 1
    Unless the other things you plan on doing are going to consume standard input you can just do them first without worrying about `stdin` until you need it. And storing it in a variable will require storage memory for it all and consuming it all *before* your other work can even start. – Etan Reisner Sep 03 '15 at 00:07
  • What are you actually planning to do with the contents of `stdin`? – Etan Reisner Sep 03 '15 at 00:07
  • Questions about why you want to aside, this *is* the correct way to do it in those situations where you need/want to. – chepner Sep 03 '15 at 01:01
  • @etanreisner Unless I'm mistaken, in a bash function the stdin is passed to the first command in the function. This is why I "can't" do the other things first. Would be happy to learn my assumption is broken and that I in fact can use stdin later, but don't know how to do that. – Paul Biggar Sep 03 '15 at 01:45
  • You are mistaken. stdin exists in the function with the data being piped in, all processes/commands run in the function share that stdin fd (unless otherwise redirected). The first command that reads from it will get the data. Try `f() { echo "foo"; echo "bar"; cat; }; printf %s\\n one two three | f` for example. (Also this is why you can run `read` multiple times in a loop to get data from stdin line-by-line, etc.) – Etan Reisner Sep 03 '15 at 01:48
  • Thanks @etanReiser, that clears things up a lot. – Paul Biggar Sep 03 '15 at 06:08
  • @MichaelJaros, `-d ''` or `-d $'\0'`; `-d\0` is wrong. (I'd argue that `-d ''` is _least wrong_; `-d $'\0'` is misleading about how bash works, implying that it can represent a NUL literal inside a string when it can't actually do that) – Charles Duffy Jun 28 '23 at 17:30
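
Putting the comment thread together, here is a minimal sketch of the `read`-based capture (using the `-d ''` form from the last comment; the function name and test input are illustrative):

capture() {
  local input
  # Read everything up to a NUL byte or EOF. read returns nonzero at EOF,
  # but the variable is still populated, trailing newlines included.
  IFS= read -r -d '' input
  printf '%s' "$input"
}
printf 'multi\nline\n' | capture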

3 Answers

15

input=$(cat) is a perfectly fine way to capture standard input if you really need to. One caveat is that command substitutions strip all trailing newlines, so if you want to make sure to capture those as well, you need to ensure that something aside from the newline(s) is read last.

input=$(cat; echo x)
input=${input%x}   # Strip the trailing x
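
For example, a quick way to confirm that the trailing newlines survive (the test input is illustrative):

printf 'data\n\n' | {
  input=$(cat; echo x)
  input=${input%x}
  printf '%s' "$input" | od -c   # shows: d a t a \n \n
}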

Another option in bash 4 or later is to use the `readarray` builtin, which populates an array from standard input, one line per element (each element keeps its trailing newline by default), which you can then join back into a single variable if desired.

readarray foo                     # one element per input line, newlines kept
printf -v foo "%s" "${foo[@]}"    # join the elements back into a single string
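
A quick round-trip check of the above (the test input is illustrative; keeping the newlines is what lets `printf` reassemble the input exactly):

printf 'one\ntwo\n' | {
  readarray foo
  printf -v foo "%s" "${foo[@]}"
  printf '%s' "$foo" | od -c   # shows: o n e \n t w o \n
}
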
chepner
  • According to [SC2145](https://www.shellcheck.net/wiki/SC2145), one should use `foo[*]` instead of `foo[@]` in a string. – lindhe Jul 25 '22 at 21:33
  • @lindhe, is "in a string" accurate anywhere here? `printf` takes an arbitrary number of arguments and repeats the format string as many times as necessary to consume them, so `foo=( one two three ); printf -v str %s "${foo[@]}"` will assign `str=onetwothree`, whereas with `"${foo[*]}"` it would be `str='one two three'` (with a default IFS). – Charles Duffy Jun 28 '23 at 17:32
8

I've found that using cat is really slow in comparison to the following method, based on tests I've run:

local input="$(< /dev/stdin)"

In case anyone is wondering, `<` is just input redirection. From the bash-hackers wiki:

When the inner command is only an input redirection, and nothing else, for example

$( <FILE )
# or
` <FILE `

then Bash attempts to read the given file and act just as if the given command were `cat FILE`.
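
As a minimal sketch of the question's example rewritten this way (the function name and test input are illustrative):

example() {
  local input
  input="$(< /dev/stdin)"   # read all of stdin without spawning a cat process
  printf 'got %d characters\n' "${#input}"
}
printf 'hello\nworld\n' | example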


Remarks about portability

In terms of how portable this method is: you are likely to go your entire Linux career without ever using a system that lacks /dev/stdin, but in case you want to satisfy that itch, here is a question on Unix & Linux Stack Exchange about the portability of directly accessing /dev/{stdin,stdout,stderr} and friends.

One more thing I've come across when working with Linux containers, such as those built with Docker or Buildah: there are situations where /dev/stdin or even /dev/stdout is not available inside the container. I've not been able to say conclusively what causes this.

smac89
0

There are a few overlapping / very similar questions floating around on SO. I answered this here, using the `read` built-in:

https://stackoverflow.com/a/58452863/3220983

In my answer there, however, I am ONLY concerned with a single line.

The arguable weakness of the `cat` approach is that it requires spawning a subshell. Otherwise, it's a good one. It's probably the easiest way to deal with multi-line processing, as specifically queried here.

I think the `read` approach is faster / more resource-efficient if you are trying to chain a lot of commands, or iterate through a list calling a function repeatedly.
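
For instance, a minimal sketch of the line-by-line pattern that approach enables (the function name and test input are illustrative):

process_lines() {
  local line
  while IFS= read -r line; do   # IFS= and -r preserve whitespace and backslashes
    printf 'line: %s\n' "$line"
  done
}
printf 'one\ntwo\n' | process_lines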

BuvinJ