In a bash script I have a variable containing a shell glob expression that I want to expand into an array of matching file names (nullglob turned on), like in

    pat='dir/*.config'
    files=($pat)

This works nicely, even for multiple patterns in $pat (e.g., pat="dir/*.config dir/*.conf"). However, I cannot use escape characters in the pattern. Ideally, I would like to be able to do

    pat='"dir/*" dir/*.config "dir/file with spaces"'

to include the file named *, all files ending in .config, and the file named "file with spaces".

Is there an easy way to do this? (Without eval if possible.)

As the pattern is read from a file, I cannot place it in the array expression directly, as proposed in this answer (and various other places).

Edit:

To put things into context: What I am trying to do is to read a template file line-wise and process all lines like #include pattern. The includes are then resolved using the shell glob. As this tool is meant to be universal, I want to be able to include files with spaces and weird characters (like *).

The "main" loop reads like this:

    template_include_pat='^#include (.*)$'
    while IFS='' read -r line || [[ -n "$line" ]]; do
        if printf '%s' "$line" | grep -qE "$template_include_pat"; then
            glob=$(printf '%s' "$line" | sed -nrE "s/$template_include_pat/\\1/p")
            cwd=$(pwd -P)
            cd "$targetdir"
            files=($glob)
            for f in "${files[@]}"; do
                printf "\n\n%s\n" "# FILE $f" >> "$tempfile"
                cat "$f" >> "$tempfile" ||
                    die "Cannot read '$f'."
            done
            cd "$cwd"
        else
            echo "$line" >> "$tempfile"
        fi
    done < "$template"
steiny
  • Another equivalent of `eval` : `source <(echo "files=($pat)")` – anishsane Jan 30 '18 at 15:14
  • @anishsane: Saw that just now. Wondering why they wouldn't want to use it – Inian Jan 30 '18 at 15:31
  • @anubhava I am using [the technique described here](https://stackoverflow.com/a/10929511/5605853) to read files line-wise and then parse them with sed (I want to parse lines like '#include dir/*.config') – steiny Jan 30 '18 at 15:41
  • @anishsane Yeah, that works, but with the same security implications as eval... If the pattern is something like `$(echo GOTCHA >&2)`, I certainly don't want to execute what's in `$(...)`. – steiny Jan 30 '18 at 15:51
  • This would fail if `pat` were something like `pat='dir a/*.config'`. Don't expect unquoted parameter expansions to ever do what you want. – chepner Jan 30 '18 at 16:16
  • You do realise that once you have dir/*, all your other additions are already included, even your file with spaces? – grail Jan 31 '18 at 01:45
  • "same security implications of eval": Exactly. That's why I said, it's just equivalent of `eval`. – anishsane Jan 31 '18 at 03:24

1 Answer

Using the Python glob module:

#!/usr/bin/env bash

# Takes literal glob expressions as argv; emits a NUL-delimited match list on stdout
expand_globs() {
  python -c '
import sys, glob
for arg in sys.argv[1:]:
  for result in glob.iglob(arg):
    sys.stdout.write("%s\0" % (result,))
' "$@"
}

template_include_pat='^#include (.*)$'
template=${1:-/dev/stdin}

# record the patterns we were looking for
patterns=( )

while read -r line; do
  if [[ $line =~ $template_include_pat ]]; then
    patterns+=( "${BASH_REMATCH[1]}" )
  fi
done <"$template"

results=( )
while IFS= read -r -d '' name; do
  results+=( "$name" )
done < <(expand_globs "${patterns[@]}")

# Let's display our results:
{
  printf 'Searched for the following patterns, from template %q:\n' "$template"
  (( ${#patterns[@]} )) && printf ' - %q\n' "${patterns[@]}"
  echo
  echo "Found the following files:"
  (( ${#results[@]} )) && printf ' - %q\n' "${results[@]}"
} >&2
Charles Duffy
  • This is fine for relative paths, but what about absolute ones? I.e. if the pattern reads `/etc/myconfig/*.config`... – steiny Jan 30 '18 at 21:39
  • Good point -- didn't realize that was covered. Amended to use the Python `glob` module instead. – Charles Duffy Jan 30 '18 at 22:17
  • Awesome, python is nice, too :) – steiny Jan 30 '18 at 22:17
  • I should probably start to write all my scripts in python... Thanks for your suggestion. What is still left: Handling of quotes, i.e., literal match for patterns like `dir/the file.config`. – steiny Jan 30 '18 at 22:28
  • `#include dir/the file.config` should work (matching `the file.config`) as-is with no changes whatsoever. – Charles Duffy Jan 30 '18 at 22:56
  • If you *really* want to allow limited quoting/escaping, it's possible -- see my answer at [bash: reading quoted/escaped arguments correctly from a string](https://stackoverflow.com/questions/26067249/bash-reading-quoted-escaped-arguments-correctly-from-a-string). – Charles Duffy Jan 30 '18 at 22:58
  • Yeah, I think python's shlex is the way to go. That way I can specify multiple globs on one line (separated with spaces) and match filenames with spaces. Although I doubt that I'll ever need to handle filenames with whitespace -- but who knows? – steiny Jan 30 '18 at 23:21
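
The shlex idea from the last comment can be sketched roughly as follows. This is only an illustration, assuming Python 3.4 or later; `expand_include` and its `root` parameter are hypothetical names, not part of the answer above. Note one limitation: `shlex.split` removes the quotes, so quoting protects whitespace but does not switch off glob characters. Under that assumption, a file literally named `*` would have to be written as `dir/[*]` rather than `"dir/*"`.

```python
import glob
import os
import shlex
import tempfile


def expand_include(spec, root="."):
    """Split spec into shell-like words, then glob each word under root.

    Hypothetical helper. Quotes only protect whitespace here; glob
    metacharacters inside quotes stay active, so use dir/[*] to match
    a file literally named "*".
    """
    matches = []
    for word in shlex.split(spec):
        # glob.escape keeps special characters in root itself literal;
        # os.path.join returns an absolute word unchanged.
        pattern = os.path.join(glob.escape(root), word)
        matches.extend(sorted(glob.glob(pattern)))
    return matches


if __name__ == "__main__":
    # Small demo in a throwaway directory (file names chosen for illustration).
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "dir"))
        for name in ("a.config", "file with spaces"):
            open(os.path.join(root, "dir", name), "w").close()
        spec = 'dir/*.config "dir/file with spaces"'
        for path in expand_include(spec, root):
            print(os.path.relpath(path, root))
```

Because nothing is passed through a shell, a line such as `#include $(echo GOTCHA >&2)` is treated as plain words to glob and is never executed. A pattern that matches nothing contributes no entries, which mirrors the nullglob behaviour mentioned in the question.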