Two important pitfalls that the other answers so far have ignored:
- Trailing newline removal from command expansion
- NUL character removal
Trailing newline removal from command expansion
This is a problem for solutions of the form value="$(cat config.txt)", but not for read-based solutions.
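As an illustration of why read-based solutions are not affected, here is a minimal sketch of a POSIX read loop that keeps every trailing newline. The helper name read_file_into_content and the variables content, line and demo_file are made up for this example. Note that it still cannot keep NUL characters.

```shell
#!/bin/sh
# read_file_into_content FILE: hypothetical helper; stores FILE's text in $content.
read_file_into_content() {
    content=
    line=
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
    while IFS= read -r line; do
        # read consumed the newline that ended the line, so put it back.
        content="$content$line
"
    done < "$1"
    # A final line without a terminating newline is left in $line when read fails.
    content="$content$line"
}

demo_file="$(mktemp)"
printf 'a\n\n' > "$demo_file"
read_file_into_content "$demo_file"
# Both newlines survive: the dump shows the bytes 61 0a 0a.
printf '%s' "$content" | od -An -tx1
rm -f "$demo_file"
```

The result is stored in a variable rather than printed, because capturing the function's output with $(...) would strip the trailing newlines again.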
Command expansion removes trailing newlines:
S="$(printf "a\n")"
# '%s' prints $S verbatim, even if it contains % or backslashes.
printf '%s' "$S" | od -tx1
Outputs:
0000000 61
0000001
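This breaks the naive method of reading from files, because every trailing newline in the file is lost:
FILE="$(mktemp)"
printf "a\n\n" > "$FILE"
S="$(<"$FILE")"
# '%s' prints $S verbatim, even if it contains % or backslashes.
printf '%s' "$S" | od -tx1
rm "$FILE"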
POSIX workaround: append an extra character inside the command expansion and remove it afterwards:
# Assumes "$FILE" still exists, so run this before the rm above.
S="$(cat "$FILE"; printf a)"
S="${S%a}"
printf '%s' "$S" | od -tx1
Outputs:
0000000 61 0a 0a
0000003
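The same trick can be wrapped in a small helper that also reports a failed read. This is only a sketch: the names slurp, slurp_result and demo_file are invented here, and the sentinel character x is an arbitrary choice.

```shell
#!/bin/sh
# slurp FILE: hypothetical helper; stores FILE's exact text in $slurp_result.
slurp() {
    # The sentinel "x" is printed only if cat succeeds, so a failed read
    # makes the command substitution, and thus the assignment, return non-zero.
    slurp_result="$(cat -- "$1" && printf x)" || return 1
    # Drop the sentinel; the newlines in front of it are kept.
    slurp_result="${slurp_result%x}"
}

demo_file="$(mktemp)"
printf 'a\n\n' > "$demo_file"
if slurp "$demo_file"; then
    # The dump shows the bytes 61 0a 0a.
    printf '%s' "$slurp_result" | od -An -tx1
fi
rm -f "$demo_file"
```

As with any command expansion, NUL characters in the file are still lost.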
Almost-POSIX workaround: ASCII-encode the data. See "Workaround for the pitfalls" below.
NUL character removal
There is no sane way to store NUL characters in Bash variables.
This affects both command expansion and read solutions, and I don't know any good workaround for it.
Example:
printf "a\0b" | od -tx1
S="$(printf "a\0b")"
printf '%s' "$S" | od -tx1
Outputs:
0000000 61 00 62
0000003
0000000 61 62
0000002
Ha, our NUL is gone!
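Because the NUL disappears silently, it may help to check a file for NUL bytes before reading it into a variable. Here is a minimal sketch that uses only POSIX utilities; the helper name has_nul and the variables total, without_nul and demo_file are made up for this example.

```shell
#!/bin/sh
# has_nul FILE: hypothetical helper; succeeds if FILE contains at least one NUL byte.
has_nul() {
    # Total size in bytes; fail with status 2 if the file cannot be read.
    total="$(wc -c < "$1")" || return 2
    # Size after deleting every NUL byte.
    without_nul="$(tr -d '\000' < "$1" | wc -c)"
    # $(( )) normalizes any padding that wc puts around the number.
    [ "$((total))" -ne "$((without_nul))" ]
}

demo_file="$(mktemp)"
printf 'a\0b' > "$demo_file"
if has_nul "$demo_file"; then
    echo "contains NUL"
fi
rm -f "$demo_file"
```

A script could use such a check to refuse the file or to switch to an encoded representation.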
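Workarounds:
- ASCII encode. See below.
- Store the escape sequence instead of the byte, for example with a Bash $"" literal:
S=$"a\0b"
printf "$S" | od -tx1
Here S holds the four characters a, \, 0 and b rather than a real NUL. printf produces the NUL only because S is used as its format string. This only works for literals, so it is not useful for reading from files.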
Workaround for the pitfalls
Store a base64-encoded version of the file, produced by uuencode, in the variable, and decode it before every use:
FILE="$(mktemp)"
printf "a\0\n" > "$FILE"
S="$(uuencode -m "$FILE" /dev/stdout)"
# '%s\n' prints $S verbatim and restores the newline stripped by the command expansion.
uudecode -o /dev/stdout <(printf '%s\n' "$S") | od -tx1
rm "$FILE"
Output:
0000000 61 00 0a
0000003
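uuencode and uudecode are in POSIX 7 but are not installed by default on Ubuntu 12.04 (they are in the sharutils package). The only POSIX 7 alternatives I see to the Bash process substitution <() extension are writing to another file, or piping the encoded text into uudecode, which reads standard input when no file operand is given.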
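If uuencode is not available, the same ASCII-encode idea can be sketched with od and printf alone, both of which are in POSIX. This is a different encoding from the uuencode one above (plain octal bytes, not base64), and the names encode_octal, decode_octal, byte, encoded and demo_file are invented for this example.

```shell
#!/bin/sh
# encode_octal FILE: print FILE as space-separated three-digit octal bytes.
encode_octal() {
    # -An: no offsets; -v: do not collapse repeated lines; -to1: one octal byte per field.
    od -An -v -to1 < "$1" | tr -s ' \n' ' '
}

# decode_octal ENCODED: write the original bytes to standard output.
decode_octal() {
    # Word splitting of $1 is intentional: each word is one octal byte.
    for byte in $1; do
        # The format string becomes e.g. \141, which printf turns into one byte.
        printf "\\$byte"
    done
}

demo_file="$(mktemp)"
printf 'a\0\n' > "$demo_file"
encoded="$(encode_octal "$demo_file")"
# The dump shows the bytes 61 00 0a: the NUL and the trailing newline both survive.
decode_octal "$encoded" | od -An -tx1
rm -f "$demo_file"
```

The encoded form takes about four characters per byte and decoding runs printf once per byte, so this is only reasonable for small files.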
Of course, this is slow and inconvenient, so I guess the real answer is: don't use Bash if the input file may contain NUL characters.