
I would like to run a grep command on Mac OS X that meets the following criteria:

  • Search only files with a *.R or *.r extension and ignore all other files
  • Find the strings wordA and wordB, allowing for the fact that they may be embedded in longer strings such as someRubbishWordARubbish (this is a valid match)
  • List only the files where both strings appear, irrespective of the order
  • Print the lines where the strings appear
  • Highlight the found words in colour
  • Print the file name as a header, with the matching lines under it (I'm inspired by the ack options)
  • Ignore case

Approach

I was thinking of making use of this discussion and starting with the following grep syntax:

grep -r --include='*.R' setHeader .

Then combining it with the following:

grep 'word1\|word2\|word3' /path

However, I would appreciate comments on ensuring that all of the criteria stated above will be evaluated correctly.

Groups

^(.*)(facet|map)(.*)(map|facet)(.*)$

regex101


Ack

Running ack -f shows that *.R files would be searched so solutions using ack will be accepted. For example, running:

ack wordA --colour -i -H --rr

gets the desired results with respect to wordA. I was thinking of combining it with the solutions discussed here, but I would like to use AND rather than OR and to ignore the order in which the strings appear. I further tried:

ack --match wordA --match wordB --colour -i -H --rr

But this produced only the results for wordB.

Konrad

2 Answers


Well, this is pretty inefficient, but assuming you aren't searching a large directory, this should work.

find . -iname "*.r" | while IFS= read -r file
do
  # Both greps must succeed; -q suppresses output, -i ignores case.
  if grep -qi "wordB" "$file" && grep -qi "wordA" "$file"
  then
    echo "======== $file ======="
    grep --color=auto -iE "wordA|wordB" "$file"
  fi
done
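
A possible variant lets grep's own --include option select the files instead of find. This is only a sketch: the function name search_both and its arguments are made up for illustration, and it assumes a grep that supports -r and --include and file names without embedded newlines.

```shell
# Hypothetical helper: search_both DIR WORD_A WORD_B
# Prints a header plus the matching lines for every *.r / *.R file
# under DIR that contains both words (case-insensitive, any order).
search_both() {
    dir=$1
    word_a=$2
    word_b=$3
    # First pass: grep -l lists only the names of files that contain word_a.
    grep -rli --include='*.[rR]' -e "$word_a" "$dir" |
    while IFS= read -r file; do
        # Second pass: keep the file only if it also contains word_b.
        if grep -qi -e "$word_b" "$file"; then
            printf '======== %s =======\n' "$file"
            grep --color=auto -iE -e "$word_a|$word_b" "$file"
        fi
    done
}
```

Calling search_both . wordA wordB from the project directory should behave like the loop above, while reading each candidate file at most three times.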
PapaBuduiit
  • Thanks very much for contributing the solution. I may be running it on a large directory, but any solution is better than no solution. I think I'll save it as a bash script as it will be handy. I'm wondering, would it be possible to make use of *--include* in *grep* instead of running the *find* at the beginning? – Konrad Feb 08 '16 at 21:14

Here's a script that combines the closely-named awk and ack commands:

find . -iname '*.r' | while IFS= read -r file; do
    awk '
        # tolower() keeps the match case-insensitive in any POSIX awk;
        # IGNORECASE is a gawk extension that the stock OS X awk ignores.
        BEGIN { sawWordA = 0; sawWordB = 0 }
        tolower($0) ~ /worda/ { sawWordA = 1 }
        tolower($0) ~ /wordb/ { sawWordB = 1 }
        sawWordA && sawWordB { exit } # stop reading lines if both matches seen
        END { exit !(sawWordA && sawWordB) }
        ' \
        "${file}" \
    && ack --nofilter -H -i 'wordA|wordB' "${file}"
done

The awk command...

  • Lists only the files where both strings appear irrespectively of the order
  • Ignores the case

...and the ack command...

  • Prints the lines where the strings appear
  • Highlights the found words in colour
  • Prints the file name as a header and lines under the header inspired by the ack options
  • Ignores the case

The awk script sets a flag whenever one of the search strings matches. If both strings have been matched, the END rule's exit !(sawWordA && sawWordB) makes awk exit with status 0. Because the two commands are joined with &&, the ack command runs only when awk exits with 0.
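
The exit-status gating can be tried in isolation. The sketch below wraps the awk part in a function; the name has_both is hypothetical, and it uses tolower() rather than IGNORECASE on the assumption that the local awk may not be gawk.

```shell
# Hypothetical helper: has_both FILE
# Succeeds (exit status 0) only if FILE contains both worda and wordb,
# ignoring case and order; otherwise the exit status is 1.
has_both() {
    awk '
        tolower($0) ~ /worda/ { a = 1 }
        tolower($0) ~ /wordb/ { b = 1 }
        a && b { exit }          # jump straight to END once both were seen
        END { exit !(a && b) }   # 0 when both flags are set, 1 otherwise
    ' "$1"
}
```

A command such as has_both file.R && echo "both found" then prints the message only for files that contain both strings.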

The ack --nofilter option tells ack to avoid reading from STDIN. Otherwise, ack would try to use the STDIN that the read command is using.

In the comments, Konrad asked how to use the above code when passing in variables in a shell script. Below is an example:

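#!/bin/sh

if [ $# -ne 2 ]; then
    echo "Usage: $0 {string1} {string2}" >&2
    E_BADARGS=65
    exit $E_BADARGS
fi

find . -maxdepth 1 -iname '*.r' | while IFS= read -r file; do
    # Pass the arguments with -v so the awk program can stay single-quoted.
    awk -v arg1="$1" -v arg2="$2" '
        # Lower-case the patterns and each line for portable case-insensitivity
        # (IGNORECASE is a gawk extension that the stock OS X awk ignores).
        BEGIN { arg1 = tolower(arg1); arg2 = tolower(arg2) }
        tolower($0) ~ arg1 { sawArg1 = 1 }
        tolower($0) ~ arg2 { sawArg2 = 1 }
        sawArg1 && sawArg2 { exit } # stop reading lines if both matches seen
        END { exit !(sawArg1 && sawArg2) }
        ' \
        "${file}" \
    && ack --nofilter -H -i "$1|$2" "${file}"
done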

The above example doesn't escape any special characters in the arguments provided to the script, so both arguments are treated as regular expressions. If the arguments must match literally, they need to be escaped before they are used.
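
One way to sidestep escaping altogether is to match fixed strings. The following is a rough sketch that swaps in grep -F for both awk and ack, so it is not the script above; the function name search_literal is hypothetical, and the output format (file name, then numbered lines) is an assumption.

```shell
# Hypothetical helper: search_literal DIR STRING_A STRING_B
# Treats both strings literally (grep -F), so characters such as
# . * ( ) | need no escaping. Case is still ignored (-i).
search_literal() {
    dir=$1
    str_a=$2
    str_b=$3
    find "$dir" -maxdepth 1 -iname '*.r' | while IFS= read -r file; do
        if grep -qiF -e "$str_a" "$file" && grep -qiF -e "$str_b" "$file"; then
            printf '%s\n' "$file"
            # Two -e options act as a literal OR over both strings.
            grep -niF --color=auto -e "$str_a" -e "$str_b" "$file"
        fi
    done
}
```

With this, search_literal . 'f(x)' 'a.b' reports only files that contain those exact character sequences, whereas the regular-expression version would also accept lines such as aXb for the second argument.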

Chad Nouis
  • How would I modify this script if I wanted to call it in bash and pass parameters for `wordA` and `wordB`? I reckon that I would have to use `"$1"` and `"$2"`, but it should still search using the `wordA|wordB` syntax? – Konrad Feb 10 '16 at 11:44
  • @Konrad I've added example code for passing variables into a shell script. – Chad Nouis Feb 10 '16 at 16:23
  • Thanks very much, it's a really useful tool. – Konrad Feb 10 '16 at 16:50