Bash splitting line with quotes into parameters

Question

I hope this question has not been asked before. At least I did not find an answer, though maybe I didn't look hard enough :(

Let's assume I have this piece of text:

hello "hello" "hello world"

Why do these two scripts produce different output?

1) The text is saved in a file

#!/bin/bash
while read line
    do
        set $line
        echo $1
        echo $2
        echo $3
    done < "file"

The output is:

hello
"hello"
"hello

2) The text is hardcoded in the script

#!/bin/bash
set hello "hello" "hello world"
echo $1
echo $2
echo $3

Here's the output:

hello
hello
hello world

I would like to get the second behavior when reading the line from a separate file. Please, stackers, help :(

This is closely related to BashFAQ #50: http://mywiki.wooledge.org/BashFAQ/050 — Charles Duffy, Aug 30 '16 at 17:58
BTW, `echo $1`; `echo $2`; `echo $3` is buggy -- look at what happens if you have `"*"` in your input. Always, *always*, **always** quote your expansions: `echo "$1"`, `echo "$2"`, `echo "$3"`. — Charles Duffy, Aug 30 '16 at 18:01
Related: http://stackoverflow.com/questions/38779095/appending-elements-specified-in-a-string-with-quotes-to-a-bash-array/38780049 — Charles Duffy, Aug 30 '16 at 18:03
Also related (perhaps enough so to be a duplicate): http://stackoverflow.com/questions/26067249/bash-reading-quoted-escaped-arguments-correctly-from-a-string — Charles Duffy, Aug 30 '16 at 18:04
BTW -- if you don't use `IFS=` on your `read`s, leading and trailing whitespace gets stripped, and if you don't use `-r`, then literal backslashes are parsed as directives to `read`. It's a good habit to use both by default unless you explicitly *want* the behaviors they suppress. — Charles Duffy, Aug 30 '16 at 18:08

John1024 · Answer 1 · 2016-08-30T18:47:42.123

This is a shell command:

set hello "hello" "hello world"

Because it is a shell command, the shell performs quote removal as the last step before executing the command.

Contrast that with the text in a file:

$ cat file
hello "hello" "hello world"

When the shell reads this file, it treats the quotes as just another character. The quotes in this file never appear directly on a command line and they are, consequently, not subject to quote removal.

Documentation

From the section in man bash discussing expansions:

Quote Removal

After the preceding expansions, all unquoted occurrences of the characters \, ', and " that did not result from one of the above expansions are removed.

How word splitting and quote removal interact

Bash does word splitting before it does quote removal. That is important, for example, for this command:

set hello "hello" "hello world"

When the shell splits this command line into words, the quotes are still present, so "hello world" stays together as one word and the shell finds three arguments to the set command. It is only as the last step before executing set that the shell does quote removal. Since no further word splitting is done, the number of arguments to set remains three.

Let's contrast the above with the result of reading a line from the file:

$ cat file
hello "hello" "hello world"
$ read line <file
$ echo "$line"
hello "hello" "hello world"

As discussed above, the shell does no quote removal on the contents of line. Now, let's use $line as the argument to set:

$ set $line
$ echo $#
4

Four arguments are found. Those arguments are:

$ echo 1=$1 2=$2 3=$3 4=$4
1=hello 2="hello" 3="hello 4=world"

As you can see, the quotes in the file are treated as just plain ordinary characters, the same as, say, h or e. Consequently, "hello world" from the file is expanded as two words, each having one quote character.
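
The contrast can be reproduced in a single script. This is only a minimal sketch: `show_args` is a hypothetical helper introduced here for illustration, and the text is held in a variable rather than read from the file.

```shell
#!/bin/bash
# Hypothetical helper: print the argument count, then each argument in brackets.
show_args() {
    printf '%d args:' "$#"
    printf ' [%s]' "$@"
    printf '\n'
}

# Quotes typed on the command line: they group words, then get removed.
show_args hello "hello" "hello world"
# prints: 3 args: [hello] [hello] [hello world]

# Quotes stored as data: the unquoted expansion is split on whitespace only,
# and the quote characters stay in the resulting words.
line='hello "hello" "hello world"'
show_args $line
# prints: 4 args: [hello] ["hello"] ["hello] [world"]
```

The brackets make it easy to see where each argument starts and ends, which plain echo hides.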

Quite so. This would be a more complete answer, however, if it also addressed how quoting and variable interpolation interact with word splitting, which is also in play here. — John Bollinger, Aug 30 '16 at 18:06
@JohnBollinger OK. I added a section on the interaction of word splitting and quote removal. — John1024, Aug 30 '16 at 18:32
Thanks, I've learned a lot while studying materials you provided :) And thanks a lot for the tip with quotes while "echo'ing" variables! — zmaselke, Sep 06 '16 at 18:37

Charles Duffy · Accepted Answer · 2016-08-30T18:05:49.023

You can do a rough version of what you want using xargs, which parses quotes, backslashes, &c. in a roughly shell-equivalent manner:

#!/bin/bash
while IFS= read -r line; do
    xargs printf '%s\n' <<<"$line"
done <file
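
The loop above reads each line with `IFS= read -r`. As a small sketch of what those two options guard against, the following compares a plain `read` with `IFS= read -r`; the sample input is made up for illustration.

```shell
#!/bin/bash
# Made-up sample input: surrounding spaces and a literal backslash.
input='  a\tb  '

# Plain read: surrounding whitespace is stripped and the backslash is consumed.
read line <<<"$input"
printf '[%s]\n' "$line"    # prints: [atb]

# IFS= keeps the surrounding whitespace, -r keeps the backslash.
IFS= read -r line <<<"$input"
printf '[%s]\n' "$line"    # prints: [  a\tb  ]
```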

If you want to read those contents into positional arguments:

#!/bin/bash
while IFS= read -r line; do

    set --                                   # clear the argument list
    while IFS= read -r -d '' element; do     # read a NUL-delimited element
      set -- "$@" "$element"                 # append to the argument list
    done < <(xargs printf '%s\0' <<<"$line") # write NUL-delimited elements

    echo "$1"          # add quotes to make your code less buggy
    echo "$2"
    echo "$3"

done <file
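
If an array is more convenient than the positional parameters, the same xargs approach can fill one instead. This is a sketch under the same assumptions as the code above; `split_quoted` and `words` are names made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: split one line into the global array "words",
# letting xargs interpret the quotes and backslashes.
split_quoted() {
    words=()
    local element
    while IFS= read -r -d '' element; do     # read a NUL-delimited element
        words+=("$element")                  # append to the array
    done < <(xargs printf '%s\0' <<<"$1")    # write NUL-delimited elements
}

split_quoted 'hello "hello" "hello world"'
printf '%d elements\n' "${#words[@]}"    # prints: 3 elements
printf '[%s]\n' "${words[@]}"            # prints [hello], [hello], [hello world] on separate lines
```

An array leaves the script's own "$@" untouched, which may matter if the script also takes command-line arguments.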

That's the solution I've been looking for! Thanks a lot! — zmaselke, Sep 06 '16 at 18:38
