13

What does -d '' do in a bash read command? The example below comes directly from a previous SO question. According to the usage text printed by read, the -d option defines the delimiter for splitting words in a line. What does an empty delimiter do?

read -d '' sql << EOF
select c1, c2 from foo
where c1='something'
EOF

echo "$sql"

By experimenting, I found that with -d '' the variable is assigned all of the lines; without it, only the first line is assigned. This behavior seems hard to explain from the usage text.

Charles Duffy
minghua
  • Empty delimiter option `-d ''` is for `NUL` delimited input. – anubhava Aug 14 '19 at 15:06
  • What does a `NUL` delimited input mean? [Inserting `'\0'` into the string](https://stackoverflow.com/questions/29885176/read-nul-delimited-fields)? – minghua Aug 14 '19 at 15:12
  • Yes, `\0` is the NUL character. – anubhava Aug 14 '19 at 15:13
  • So basically: read the input until hit a `'\0'`, treat that part of input as a word, and assign it to the first variable. – minghua Aug 14 '19 at 15:16
  • Will the standard input actually give you a `'\0'` at the end of input? Or is it just some special convention on the read side? – minghua Aug 14 '19 at 15:18
  • It's interesting to see that a bash string [can contain `'\0'`](https://unix.stackexchange.com/questions/75186/how-to-do-head-and-tail-on-null-delimited-input-in-bash) [in it](https://stackoverflow.com/questions/29885176/read-nul-delimited-fields), [and not](https://unix.stackexchange.com/questions/523855/nul-delimited-variable). Will dig in more. – minghua Aug 14 '19 at 15:25
  • anubhava, maybe you can put this NUL delimiter explanation into an answer with some elaboration. This is new to me, and I guess some other people do not know about it either. Thanks for the quick answer. – minghua Aug 14 '19 at 15:28
  • At end of input, no, but there are other ways to put a NUL into stdin. `printf '%s\0' "first field" "second field" | some-script` puts NULs after `first field` and `second field` on some-script's stdin. – Charles Duffy Aug 14 '19 at 15:58
  • If there isn't a literal NUL, by the way, `read` will have a nonzero exit status; so if you're using this code with `set -e` ([which you absolutely shouldn't use](http://mywiki.wooledge.org/BashFAQ/105#Exercises)), your script will exit before it gets to the `echo`. – Charles Duffy Aug 14 '19 at 15:59
  • BTW, `-d ''` works because the first character in the argument passed to `-d` is the one used as the delimiter; for an empty string, the first character is the NUL delimiter that's after *all* C strings. – Charles Duffy Aug 14 '19 at 16:01
  • @minghua, ...huh? A bash string **absolutely cannot** contain a NUL literal, full-stop. `'\0'` is not a NUL literal; it's just an escape sequence that `printf` would turn into one. – Charles Duffy Aug 14 '19 at 16:01

2 Answers

11

read -d changes the character that stops the read from the default newline to the first character of the following argument.

The important thing to understand is that bash uses C strings, which are terminated by literal NULs. When the following argument is '', the first (and only) character is the NUL terminating it; so when the shell dereferences the char* to get the first character it points to, it gets a NUL.
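
To make the "first character of the argument" rule concrete, here is a minimal sketch, assuming bash and using made-up sample strings. It reads once with a colon delimiter and once with the empty-string (NUL) delimiter; the variable names first and record are only for illustration.

```shell
#!/usr/bin/env bash
# With -d ':' read stops at the first colon instead of at a newline.
IFS= read -r -d ':' first <<< 'alpha:beta:gamma'
echo "first=$first"

# With -d '' the delimiter is NUL, so read consumes input up to the first
# NUL byte, newlines included. printf '%s\0' writes each argument followed
# by a NUL byte.
IFS= read -r -d '' record < <(printf '%s\0' $'line one\nline two' 'second record')
echo "record=$record"
```

In both calls the delimiter is present in the input, so read should return success; the second call keeps the embedded newline because a newline is no longer the stopping character.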


Now, when you redirect a heredoc with <<EOF, that document won't actually have any NULs in it -- so how does your code work?

The answer is that your code expects the read operation to fail. Even when it fails, read still populates its destination variable; so if you don't have a terminating delimiter, read has a nonzero exit status... but it still puts all the data you wanted to collect in the variable anyhow!
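
The failure-but-still-populated behavior can be observed directly. The sketch below, assuming bash, reuses the heredoc from the question and captures the exit status of read in a variable named status (a name chosen here for illustration).

```shell
#!/usr/bin/env bash
# The heredoc contains no NUL byte, so read reaches end-of-input before it
# ever sees its delimiter and reports failure.
status=0
read -r -d '' sql <<'EOF' || status=$?
select c1, c2 from foo
where c1='something'
EOF

echo "exit status: $status"   # nonzero, because no delimiter was found
echo "$sql"                   # both lines are still in the variable
```

Capturing the status with || status=$? also keeps the failing read from aborting a script that runs under set -e.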

For a version that doesn't trigger set -e errors, consider checking whether the destination variable is empty after the read is complete:

{ IFS= read -r -d '' string || [[ $string ]]; } <<'EOF'
...string goes here...
EOF

What are the changes we made?

  • IFS= prevents leading or trailing whitespace (or other characters, should IFS have been redefined) from being stripped.
  • read -r prevents content with backslash literals from being mangled.
  • || [[ $string ]] means that if read reports a failure, we then check whether the string was populated, and still consider the overall command a success should the variable be non-empty.
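
The effect of IFS= and -r listed above can be compared side by side. This is a small sketch assuming bash; the sample input is hypothetical and was chosen to contain surrounding spaces and backslashes.

```shell
#!/usr/bin/env bash
# Hypothetical sample input: surrounding spaces, backslashes, no NUL byte.
input='   C:\temp\new  '

# Plain read: the default IFS trims the surrounding whitespace, and without
# -r each backslash is treated as an escape character and removed.
read -d '' plain <<< "$input" || [[ $plain ]]

# IFS= and -r together keep the text as written. The here-string appends a
# newline, which is kept too because nothing is trimmed.
IFS= read -r -d '' exact <<< "$input" || [[ $exact ]]

printf 'plain=[%s]\n' "$plain"
printf 'exact=[%s]\n' "$exact"
```

Both commands end in success because the || [[ ... ]] check finds the variable non-empty, even though read itself reports failure for lack of a NUL.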
Charles Duffy
  • This is helpful. Otherwise `read -d $'\0'` is incompatible with `set -e`. And read doesn't even have the decency to print an error message when it returns non-zero :-) – Iain Samuel McLean Elder Aug 27 '21 at 14:12
  • @IainSamuelMcLeanElder, ...mind, I'd argue that [nobody should _ever_ use `set -e`](https://mywiki.wooledge.org/BashFAQ/105#Exercises). – Charles Duffy Nov 08 '22 at 21:46
6

In the bash read builtin, the empty-string delimiter -d '' behaves the same as using a NUL byte as the delimiter, which can also be written as $'\0' (an ANSI C-quoted string) or 0x0 in hex.

-d '' specifies that each input record is delimited by a NUL byte rather than a newline. Each invocation of read consumes input up to the next NUL character.

Usually it is used with IFS= as:

IFS= read -r -d ''

so that leading and trailing whitespace in the input is preserved rather than trimmed.

A common example of processing NUL delimited input is:

while IFS= read -r -d '' file; do
    echo "$file"
done < <(find . -type f -print0)

  • find prints every regular file under the current directory, writing a NUL after each entry.
  • read -d '' sets \0 as the delimiter, so each iteration reads one entry from the output of find.
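
The reason for NUL delimiting shows up with awkward file names. The sketch below, assuming bash and a find that supports -print0, creates a hypothetical scratch directory containing a name with a space and a name with a newline, then collects the entries into an array named files (both the directory and the names are made up for illustration).

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory with awkward file names.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/"$'with\nnewline.txt'

# Each NUL-terminated entry becomes exactly one array element.
files=()
while IFS= read -r -d '' file; do
    files+=("$file")
done < <(find "$dir" -type f -print0)

echo "found ${#files[@]} files"
printf '  %q\n' "${files[@]}"

rm -rf -- "$dir"
```

A newline-delimited loop over the same directory would split the third name into two entries, since a newline is a legal character in a file name while NUL is not.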

Related: Why ‘read’ doesn’t accept \0 as a delimiter in this example?

codeforester
anubhava
  • I'd argue that `$'\0'` is misleading, insofar as its use implies that it's generating something somehow different from `''`, and that bash is able to represent NUL literals in strings. – Charles Duffy Aug 14 '19 at 16:09
  • Yes, I agree that use of `$'\0'` is misleading, but the behavior of `bash` is the same whether we use `-d ''` or `-d $'\0'` – anubhava Aug 14 '19 at 16:13