
I am writing a script that searches for several strings in several folders. I need to run multiple greps, one per string, in the order the strings appear in the array.

The variable read from the array is: ${STRINGS_2_SEARCH[$j]}

It should be something like this in a loop:

find "${FOLDERS_2_SEARCH[$i]}" -type f -name "*.*" | \
   xargs zegrep -i "${STRINGS_2_SEARCH[1]}" | \
   xargs zegrep -i "${STRINGS_2_SEARCH[2]}" ....... | \
   xargs zegrep -i "${STRINGS_2_SEARCH[n]}"

The `| xargs zegrep -i ${STRINGS_2_SEARCH[j]}` part needs to be repeated according to the number of strings to search that I added to the array.

Regards Raz

Lev Levitsky
  • You could probably condense this into a single command. `find` accepts multiple paths to search and the regexes could trivially be merged into a single regex. If output order is important, postprocessing might be more efficient than running multiple `find`s. – tripleee Jan 02 '14 at 08:51

4 Answers


Another solution would be to use awk for the search, which avoids multiple invocations of grep.

First step: build an awk search pattern from your array.

patt=""
for px in "${STRINGS_2_SEARCH[@]}"
do
  patt="$patt;/$px/"      # append one /string/ rule per search string
done
awkpattern=${patt:1}      # drop the leading ';'

Second step: pipe the output of find into a loop that runs the search.

find .... | while IFS= read -r pth
do
  awk "$awkpattern" "$pth"   # search the contents of each file
done
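
As a hedged illustration of the first step, the sketch below wraps the pattern building in a function and prints the resulting awk program. The function name `build_awk_pattern` is hypothetical and not part of the answer. Note that awk treats `/foo/;/bar/` as two separate rules, so a line that matches both strings is printed twice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join search strings into an awk program such as /foo/;/bar/
build_awk_pattern() {
  local patt="" s
  for s in "$@"; do
    patt="$patt;/$s/"
  done
  printf '%s\n' "${patt:1}"   # drop the leading ';'
}

STRINGS_2_SEARCH=(foo bar)
awkpattern=$(build_awk_pattern "${STRINGS_2_SEARCH[@]}")
echo "$awkpattern"            # prints /foo/;/bar/
```

The strings are used as awk regular expressions, so a string containing `/` or other metacharacters would need escaping before it is passed in.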
tue

Instead of multiple greps, create a single pattern by concatenating all your search strings and then use a single grep as shown below:

STRINGS_2_SEARCH=(foo bar baz)
PATTERN=$(IFS='|'; echo "${STRINGS_2_SEARCH[*]}")

find ${FOLDERS_2_SEARCH[$i]} -type f -name "*.*" | xargs zegrep -i "$PATTERN"
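
A minimal sketch of the joining step on its own, assuming plain-text input and `grep -E` in place of `zegrep`. The helper name `join_patterns` is hypothetical. Declaring `IFS` as a local variable inside the function keeps the changed separator from leaking into the rest of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join all arguments with '|' to form one extended regex.
join_patterns() {
  local IFS='|'
  printf '%s\n' "$*"
}

STRINGS_2_SEARCH=(foo bar baz)
PATTERN=$(join_patterns "${STRINGS_2_SEARCH[@]}")

# Case-insensitive match of any of the strings;
# prints the first and third input lines.
printf 'A Foo line\nnothing here\nBAZ again\n' | grep -E -i "$PATTERN"
```

Because the joined strings are interpreted as a regular expression, any search string containing regex metacharacters would need escaping first.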

Alternatively, write another script called mygrep.sh and write a loop to grep for each pattern on a single file:

#!/bin/bash
# mygrep.sh
STRINGS_2_SEARCH=(foo bar baz)
FILE="$1"
for i in "${STRINGS_2_SEARCH[@]}"
do
    zegrep -i "$i" "$FILE"
done

Then run:

find ${FOLDERS_2_SEARCH[$i]} -type f -name "*.*" -exec mygrep.sh {} \;
dogbane
  • Why not use `-exec` on your first example also instead of piping? – RedX Jan 02 '14 at 10:15
  • `xargs` is more efficient than `find ... -exec` for large numbers of files because it won't create a new child grep process for each file. – dogbane Jan 02 '14 at 13:53
  • But wouldn't piping cause problems if there are spaces in the filenames/paths? – RedX Jan 02 '14 at 14:59
  • If file names can contain newlines, spaces, etc., then you must use `find ... -print0 | xargs -0...` so that the file names are null-delimited instead. – dogbane Jan 02 '14 at 15:31
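
The null-delimited variant mentioned in the last comment might look like the sketch below. It assumes a `find` and `xargs` that support `-print0` and `-0` (the GNU and BSD versions do), uses `grep -E` on plain-text files rather than `zegrep`, and the function name `search_tree` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search every regular file under a directory for a pattern.
# Usage: search_tree DIR PATTERN
search_tree() {
  local dir=$1 pattern=$2
  # -print0 / -0 pass file names separated by NUL bytes, so spaces and
  # newlines in names survive; -H prints the file name even for one file.
  find "$dir" -type f -print0 | xargs -0 grep -E -i -H -- "$pattern"
}
```

It could then be called as `search_tree "${FOLDERS_2_SEARCH[$i]}" "$PATTERN"`.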

You don't need find, only grep

shopt -s globstar
pattern=( -e 'example' -e 'multiple' -e 'on' )
files=( *.html *.log **/*.txt )
grep "${pattern[@]}" "${files[@]}"

Explanation

  1. globstar: lets `**` match files recursively
  2. pattern: stores the patterns in an array (which also improves readability in a script)
  3. files: stores the files to grep in an array, expanded with ${files[@]}

-e PATTERN, --regexp=PATTERN

  Use PATTERN as the pattern.  This can be used to specify multiple 
  search patterns, or to protect a pattern beginning with a
  hyphen (-).  (-e is specified by POSIX.)
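
If the search strings already live in an array such as `STRINGS_2_SEARCH` from the question, the `-e` list can be built from it. A minimal sketch, where `pattern_args` is a hypothetical array name:

```shell
#!/usr/bin/env bash
STRINGS_2_SEARCH=(example multiple on)

# Build: -e example -e multiple -e on
pattern_args=()
for s in "${STRINGS_2_SEARCH[@]}"; do
  pattern_args+=( -e "$s" )
done

# A line is printed if it matches any of the patterns;
# here the first and third input lines match.
printf 'one example\nzzz\nmultiple hits\n' | grep "${pattern_args[@]}"
```

Keeping each `-e` and its string as separate array elements means strings with spaces are passed to grep intact.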


Édouard Lopez
  • this won't work if you have a large number of files. You will get an "argument list too long" error. – dogbane Jan 02 '14 at 13:58
  • The issue is not about the number of files, but the length of the expanded command. Moreover the `ARG_MAX` value is pretty high nowadays (> 2M): http://stackoverflow.com/a/18647755/802365 – Édouard Lopez Jan 02 '14 at 16:06

Here's a solution that uses awk to turn the list of strings you are searching for into a single regular expression that matches any of them, i.e. "foo bar" becomes "foo|bar". This string is then fed to grep. I have added the -Hn flags to grep, which print the file name and line number along with each match and could make parsing the results easier, depending on your use case. Finally, grep is called on each file that find locates, via the -exec flag.

find "${FOLDERS_2_SEARCH[$i]}" -type f -name "*.*" \
-exec egrep -Hn \
"$( awk '{for(i=1;i<NF;i++) { printf("%s|", $i) }; printf("%s\n", $NF) }' \
<( echo ${STRINGS_2_SEARCH[$j]} ) )" {} \;
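
A variation on the same idea, sketched under the assumption that the search strings are separate elements of `STRINGS_2_SEARCH`: the join is done in the shell, and `-exec ... {} +` lets `find` pass many files to each grep process instead of starting one per file. Here `grep -E` stands in for `egrep`, and `search_all` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: search DIR for any of the remaining arguments.
# Usage: search_all DIR STRING...
search_all() {
  local dir=$1; shift
  local IFS='|'
  local pattern="$*"          # e.g. "foo|bar"
  # '{} +' batches file names into as few grep invocations as possible.
  find "$dir" -type f -exec grep -E -Hn -- "$pattern" {} +
}
```

A call might look like `search_all "${FOLDERS_2_SEARCH[$i]}" "${STRINGS_2_SEARCH[@]}"`.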
qwwqwwq