This one has me flummoxed.

I'm creating a Bash script which copies files into a series of repos and then adds them for commit. The files sometimes have spaces in the filenames, so they need to be quoted.

I've created a quoted space-separated list of filenames in a variable in Bash: $x. When I run echo $x I get this:

'test 01.sql' 'test 02.sql' 'test_03.sql'

If I manually run the following (in the appropriate directory), I have no problem:

git add 'test 01.sql' 'test 02.sql' 'test_03.sql'

But in my script, if I run git add $x, git add "$x", or git add "${x}", I get a fatal pathspec error from Git:

fatal: pathspec ''test 01.sql' 'test 02.sql' 'test_03.sql'' did not match any files

I've tried both single and double quoted strings with no difference.

The example has been simplified. The full version uses absolute paths to the files.

'/Volumes/HardDrive/Repo/queries/test 01.sql' '/Volumes/HardDrive/Repo/queries/test 02.sql' '/Volumes/HardDrive/Repo/queries/test_03.sql'

It works when echoed from the script and pasted manually into the git add command, but doesn't work when passed from a variable in the script.

Christopher Werby
  • To better understand why bash behaves the way it does, consider reading http://mywiki.wooledge.org/BashParser. Or, to make an immediately pertinent point: Quotes from variable expansion are always data, not syntax, as expansions occur only *after* syntax-level parsing is complete. (This is actually mandatory for correctness: Were it not so, it would be impossible to write shell scripts handling untrusted data safely). – Charles Duffy Dec 02 '15 at 21:08
  • http://mywiki.wooledge.org/BashFAQ/050 is also directly relevant. – Charles Duffy Dec 02 '15 at 21:10
  • By the way -- where does your "quoted space-separated list of filenames" come from? If you're building it via string manipulation, the code generating this list is all but certain to have subtle bugs that could be exploited with appropriately crafted filenames. (Filenames are allowed to contain literal quote characters, literal newlines -- anything but NULs; thus, a name containing literal `'` characters could potentially escape its quoting and run commands, aka `'"$(rm -rf .)"'`, were a naively quoted string `eval`'d). – Charles Duffy Dec 02 '15 at 21:12
  • Also, `echo $x` doesn't actually behave the way you expect it to, with the value for `x` in question. Run `printf '%q\n' $x` to see the list it actually evaluates to. – Charles Duffy Dec 02 '15 at 21:17
  • Your security comments are appreciated! This script is being used only on my local machine to tie SQL backup files to the Git repos for the projects that spawned them. The filenames all come from script-generated file globs that originate in the SQL backup folder. I think I'm okay for this application. In the PHP world, eval is evil, so I try not to use it in Bash either. That might be an overgeneralization. – Christopher Werby Dec 02 '15 at 22:09
  • The URLs that Charles Duffy provided are directly on point. I'm afraid I need to have him speak a bit slowly to me when it comes to how the Bash parser changes quote marks from syntactical to literal. Because they sure look syntactical when they are echoed. – Christopher Werby Dec 02 '15 at 22:13
  • `echo` emits the data it's passed: Anything that's syntactical was consumed by the shell before it ever got to `echo` -- just like how `echo "foo"` emits `foo`; the quotes were consumed before they reached `echo`. Thus, when `echo` emits quotes, that's a sure sign that those quotes were passed as data. – Charles Duffy Dec 02 '15 at 22:20
  • ..but going back to why `echo` is a poor choice to understand how things actually work: Output is exactly the same between `echo "foo bar"`, `echo foo bar`, and `echo "foo" "bar"`, despite the first of these having different semantics from the other two. You'll note that `printf '%q\n' "foo bar"` and `printf '%q\n' foo bar`, by contrast, are distinguishable. – Charles Duffy Dec 02 '15 at 22:21
  • And yes, `eval` is evil in bash too. There are safe ways to use it (using `printf '%q'` to have the shell quote contents itself), but they take a great deal of care; http://mywiki.wooledge.org/BashFAQ/048 goes into detail. – Charles Duffy Dec 02 '15 at 22:23
  • The `printf '%q'` approach is exactly what I needed to see the problem. It reveals that all the single quotes are escaped: `\'test\ 01.sql\'\ \'test\ 02.sql\'\ \'test_03.sql\'`. That's not going to work! – Christopher Werby Dec 02 '15 at 22:27
  • *nod* -- having those values escaped in the output of `printf %q` shows that they were passed to it as literal data, as opposed to being parsed as syntax (compare to the output of `printf '%q\n' 'test 01.sql' 'test 02.sql' 'test 03.sql'`). `set -x` for bash also provides a similarly useful display distinguishing between syntactical and literal content, though not all other shells supporting `set -x` do the same. – Charles Duffy Dec 02 '15 at 22:31
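
The point made in these comments can be checked without Git at all. The following is a minimal sketch, assuming the variable holds the same string as in the question; `show_args` is a hypothetical helper (not part of Bash or Git) that only reports what a command such as `git add` would receive.

```shell
#!/usr/bin/env bash
# Same value as in the question: the single quotes are part of the data.
x="'test 01.sql' 'test 02.sql' 'test_03.sql'"

# Hypothetical helper: print the argument count, then each argument in brackets.
show_args() {
  printf '%d argument(s)\n' "$#"
  printf '[%s]\n' "$@"
}

# Unquoted: the value is split on whitespace; the quote characters stay literal.
show_args $x

# Quoted: the whole string arrives as one single argument.
show_args "$x"
```

Neither call yields the three filenames: the unquoted form produces five fragments such as `'test` and `01.sql'`, and the quoted form produces one long name, which matches the pathspec shown in the error message.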

3 Answers

Instead of creating a concatenated string, use an array, for example:

# Start with one element, then append the others; each name stays one element
arr=('test 01.sql')
arr+=('test 02.sql')
arr+=('test_03.sql')

Then you'll be able to add the files in the Bash array using:

git add "${arr[@]}"
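
Since the asker mentions that the filenames come from file globs, the array can also be filled directly from a glob. This is only a sketch under assumptions: `backup_dir` is a hypothetical scratch directory standing in for the SQL backup folder, and the final `printf` stands in for the real `git add` call so the snippet runs outside a repository.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory standing in for the SQL backup folder.
backup_dir=$(mktemp -d)
trap 'rm -rf "$backup_dir"' EXIT
touch "$backup_dir/test 01.sql" "$backup_dir/test 02.sql" "$backup_dir/test_03.sql"

# With nullglob, a pattern that matches nothing expands to nothing
# instead of being passed through as a literal string.
shopt -s nullglob

# The glob itself is unquoted so it expands; each match becomes one element,
# spaces included.
arr=( "$backup_dir"/*.sql )

printf '%d file(s) found\n' "${#arr[@]}"
if (( ${#arr[@]} > 0 )); then
  # In the real script this line would be: git add -- "${arr[@]}"
  printf 'would add: %s\n' "${arr[@]}"
fi
```

Checking the element count first avoids calling `git add` with no pathspec when the glob matches nothing.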
janos
  • This works perfectly! I thought that `git add` wouldn't accept an array. I tested this approach before I went down the rabbit hole of converting my filename arrays into quoted strings. Apparently, Bash array syntax tripped me up once again! – Christopher Werby Dec 02 '15 at 22:01
  • No, git doesn't accept an array ;-) With the syntax in my example, the shell expands the array into a list of separate arguments. Git receives them as if you had typed the arguments manually on the command line – janos Dec 02 '15 at 22:35
  • Oh. I didn't understand that. I have to admit that I find the Bash array syntax particularly opaque, especially with when to quote or not quote, when to use [@] or not, or [*], when to use `$arr` or `${arr}` or `"${arr}"` or even just `arr`. The simplest thing has me heading for Google. I needed to subtract 1 from a variable and assign it to another variable. I ended up with the clearly suboptimal `result=$(( orig - 1 ))` (with all the white space characters being critically important). It's hard being a Bash newbie! Thank you for the help! – Christopher Werby Dec 02 '15 at 22:59
  • Everything's magic until you get it. It's not that hard to get Bash, and you're on the right track. It's worth knowing and extremely useful. But if your scripts get too complicated, then go with Python or Ruby. In your last example, the only crucial spaces are around the = sign: there must be no spaces. That's about it. Inside the `$((...))` the syntax is much more relaxed; the spaces don't matter there at all – janos Dec 02 '15 at 23:10
  • @ChristopherWerby, `result=$(( orig - 1 ))` isn't actually all that suboptimal -- it's actually the preferred syntax as of modern POSIX sh. :) – Charles Duffy Dec 02 '15 at 23:10
  • @ChristopherWerby, ...as for `${foo}` vs `$foo`, the braces only matter if you need to disambiguate or parameterize the expansion; they're otherwise identical. `$foo` vs `"$foo"`, by contrast, matters deeply; the general rule is to always quote expansions unless you have a specific and compelling reason to do otherwise (an unquoted expansion has its result string-split and glob-expanded). For `"${arr[*]}"` vs `"${arr[@]}"`, the former creates a single string from your array (by putting the first character of `$IFS` between each element); the latter keeps the array elements as separate words. – Charles Duffy Dec 02 '15 at 23:13
  • @ChristopherWerby, ...so, absent a specific and compelling reason, always use `"${foo[@]}"`, not `${foo[*]}`, and `"$@"`, not `$*`. http://shellcheck.net/ is useful for catching a lot of the "practice X is almost never correct" scenarios. – Charles Duffy Dec 02 '15 at 23:14
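
The difference between these expansion forms is easy to see by counting arguments. A small sketch, assuming the three filenames from the question; `count_args` is a hypothetical helper introduced only for this illustration.

```shell
#!/usr/bin/env bash
arr=('test 01.sql' 'test 02.sql' 'test_03.sql')

# Hypothetical helper: report how many arguments were received.
count_args() { echo "$#"; }

at_quoted=$(count_args "${arr[@]}")     # one word per array element
star_quoted=$(count_args "${arr[*]}")   # all elements joined into a single word
at_unquoted=$(count_args ${arr[@]})     # elements split again on whitespace

echo "quoted @: $at_quoted, quoted *: $star_quoted, unquoted @: $at_unquoted"
```

Only the quoted `[@]` form reports three arguments, one per filename, which is why it is the form to pass to `git add`.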

A shell-quoted list of names is a very poor choice of formats to use for programmatic (as opposed to human) input.

This seems nonintuitive, but it's true for a reason: When you type in a command in the shell, that command is parsed as code; it's able to contain redirections, command substitutions, and other expansions with potentially dangerous side effects.

To allow data to be safely handled without any risk of evaluation as code, the shell performs parameter expansion only after most other parsing stages (exclusive of string splitting of expansion results and globbing) are complete.


If you were manually generating this input and reviewing it for correctness, you could use eval:

# THIS IS DANGEROUS unless you trust your string to contain no malicious content!
files="'test 01.sql' 'test 02.sql' 'test_03.sql'"
eval "git add -- $files"

However, if you're programmatically generating this list, format it as a NUL-delimited stream, and use xargs:

# generate a list in unambiguous NUL-delimited form
printf '%s\0' "/path/to/file 1" "/path/to/file 2" >file.txt

# use that list to run `git add` for the named files
xargs -0 git add -- <file.txt

...or a NUL-delimited stream can be read into a shell array:

# read that list (the same file.txt written above) into an array
files=( )
while IFS= read -r -d '' filename; do
  files+=( "$filename" )
done <file.txt

# ...and use the array
git add -- "${files[@]}"
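
To see that the NUL-delimited form survives awkward names, the write and read steps can be tried together. This is a sketch under assumptions: `list_file` is a hypothetical temporary file, and the name containing a newline is invented purely to stress the format.

```shell
#!/usr/bin/env bash
# Hypothetical temporary file holding the NUL-delimited list.
list_file=$(mktemp)
trap 'rm -f "$list_file"' EXIT

# Write three names, one with a space and one with an embedded newline.
printf '%s\0' 'test 01.sql' $'odd\nname.sql' 'test_03.sql' >"$list_file"

# Read them back: IFS= keeps leading/trailing whitespace, -r keeps backslashes,
# and -d '' makes read stop at each NUL byte instead of at a newline.
files=()
while IFS= read -r -d '' filename; do
  files+=( "$filename" )
done <"$list_file"

printf '%d name(s) read\n' "${#files[@]}"
printf '%q\n' "${files[@]}"   # show each element in shell-quoted form
```

A newline-delimited list would have split the second name in two; with NUL as the delimiter it stays a single array element.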
Charles Duffy
  • I'm accepting this as the answer because it answers the original question, which was about passing a string. It also points out, as does janos, that the best approach is passing the array. Bash scripting seems like a dark art. It's surprising how hard it is to do seemingly simple things: assign an array to a variable, pass an array into a function, return an array from a function. Other Stack Overflow questions have helped out a lot, but I wonder if I shouldn't be shelling to Ruby or Python and doing hard stuff there? – Christopher Werby Dec 02 '15 at 22:16

Is it possible to simply do:

git add .

That way you don't need to reference the individual filenames that may contain spaces (but it will stage everything).

Jonathan.Brink
  • Yes. I could also use git add -A. But my script needs to specifically target the files added to the commit so that I don't inadvertently commit other files which have intentionally not been staged. – Christopher Werby Dec 02 '15 at 21:55