
I am trying to run a multi-line regex match with grep to find and remove certain ESLint comments from a JavaScript file (I don't think the fact that it is JavaScript is especially relevant; I'm just providing context).

With the help of this question I learned about the -Pzo options: -P enables Perl-compatible regex, -z makes grep treat input and output records as NUL-terminated instead of newline-terminated (so a file with no NUL bytes is matched as one long record, and each match is followed by a NUL on output), and -o prints only the matching portions. Together, these options give me the following command: grep -Pzo '(?s)\s*\/\* globals .*?\*\/' "file.js"

When I run this command from a terminal, I get the output I expect. For example:


/* globals text text */
/* globals text text */
/* globals text text */
/* globals text text */
/* globals text */
/* globals text */
/* globals text */
/* globals text */
/* globals text */
/* globals ...
        ... */

I want to run this command in a bash script and store the result in an array. From this question I learned how to create an array from the command output, and I came up with this script:

#!/bin/bash
matches=( $(grep -Pzo '(?s)\s*\/\* globals .*?\*\/' "file.js") )
for m in "${matches[@]}"; do
    echo "${m}"
done

But when I run this script, the output of the echo looks like this:

/bin
/boot
/dev
/etc
/home
/init
... more directories, then it repeats
/bin
...

Is there some subtle difference between how grep runs in the terminal versus in the script that I'm missing? Running bash --version both from my terminal and inside the script reports:

GNU bash, version 5.0.17(1)-release (x86_64-pc-linux-gnu)
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
Jon G

1 Answer


It's this line:

matches=( $(grep -Pzo '(?s)\s*\/\* globals .*?\*\/' "file.js") )

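Because the output of the command substitution is unquoted, the shell applies word splitting and filename expansion to it (see "Security implications of forgetting to quote a variable in bash/POSIX shells" for all the details).

After splitting, /* and */ are standalone words that the shell treats as glob patterns: /* expands to every entry in the root directory, which is the list you are seeing, and */ expands to any subdirectories of the current directory.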
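
The effect can be reproduced without grep at all. The following is a minimal sketch in which a literal string (the hypothetical variable out) stands in for one line of the grep output; the exact number of elements depends on what is in your root and current directories.

```shell
#!/bin/bash
# Illustrative only: a literal string stands in for the grep output.
out='/* globals text */'

unquoted=( $out )    # split into words, then each word is glob-expanded
quoted=( "$out" )    # kept as a single, literal element

echo "unquoted: ${#unquoted[@]} elements, first is ${unquoted[0]}"
echo "quoted:   ${#quoted[@]} element: ${quoted[0]}"
```

The unquoted array starts with the entries of / rather than with the comment text, which matches the behaviour described in the question.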

To read the lines of output of a command into an array, use mapfile with input redirected from a process substitution:

mapfile -t matches < <(grep ...)
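
One caveat for this particular command: grep -z terminates each match with a NUL rather than a newline, so a multi-line match would be spread across several array elements if mapfile split on newlines. In bash 4.4 and later, mapfile accepts -d '' to split on NUL instead. Here is a sketch of that variant, assuming GNU grep with -P support; the temporary file and its contents are made up for illustration and stand in for file.js.

```shell
#!/bin/bash
# Hypothetical sample input; a temporary file stands in for file.js.
sample=$(mktemp)
printf '%s\n' \
    '/* globals foo bar */' \
    'var x = 1;' \
    '/* globals baz,' \
    '        qux */' \
    'var y = 2;' > "$sample"

# -d '' splits on NUL, which is what grep -z emits after each match;
# -t strips that delimiter from every element.
mapfile -d '' -t matches < <(grep -Pzo '(?s)\s*\/\* globals .*?\*\/' "$sample")
rm -f "$sample"

# Brackets make the element boundaries visible.
for m in "${matches[@]}"; do
    printf '[%s]\n' "$m"
done
```

Each comment ends up in its own element, including the one that spans two lines. Because the pattern begins with \s*, an element may start with the whitespace that preceded the comment.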
glenn jackman