12

I need to search a directory with many subdirectories for a string inside files:

I'm using:

grep -c -r "string here" *

How can I get the total count of finds?

How can I output to file only those files with at least one instance?

Codex73

7 Answers

10

Using Bash's process substitution, this gives what I believe is the output you want? (Please clarify the question if it's not.)

grep -r "string here" * | tee >(wc -l)

This runs grep -r normally, with output going both to stdout and to a wc -l process.
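If the count is needed in a variable rather than printed after the matches, one possible variation is to save the matching lines first and count them afterwards. This is only a sketch: the demo tree, file names, and the `matches` temp file are made up for illustration.

```shell
# Hypothetical demo tree; names and contents are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'string here\nother\nstring here again\n' > "$demo/a.txt"
printf 'nothing\n' > "$demo/b.txt"
printf 'string here\n' > "$demo/sub/c.txt"

# Keep the matching lines in a file outside the searched tree,
# so grep does not end up searching its own output.
matches=$(mktemp)
grep -r "string here" "$demo" > "$matches"

# Count matching lines; tr strips the padding some wc versions add.
total=$(wc -l < "$matches" | tr -d ' ')
echo "Total matching lines: $total"
```

Like the `tee` version, this counts matching lines, not individual occurrences within a line.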

ephemient
9

It works for me (it gets the number of lines matching 'string here' in each file). However, it does not display the total for ALL files searched. Here is how you can get it:

grep -c -r 'string' file > out && \
    awk -F : '{total += $2} END { print "Total:", total }' out

The list will be in out and the total will be sent to STDOUT.
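A small self-contained run of the same idea, as a sketch. The demo tree is made up, and summing `$NF` (the last field) instead of `$2` is my own tweak so that a path containing a colon does not skew the total.

```shell
# Hypothetical demo tree; names and contents are illustrative only.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'import os\nimport sys\nprint(1)\n' > "$demo/x.py"
printf 'pass\n' > "$demo/y.py"
printf 'import re\n' > "$demo/sub/z.py"

# Per-file counts go to a file outside the searched tree.
out=$(mktemp)
grep -c -r 'import' "$demo" > "$out"

# Sum the last colon-separated field of every line.
total=$(awk -F : '{ total += $NF } END { print total + 0 }' "$out")
echo "Total: $total"
```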

Here is the output on the Python2.5.4 directory tree:

grep -c -r 'import' Python-2.5.4/ > out && \
    awk -F : '{total += $2} END { print "Total:", total }' out
Total: 11500

$ head out
Python-2.5.4/Python/import.c:155
Python-2.5.4/Python/thread.o:0
Python-2.5.4/Python/pyarena.c:0
Python-2.5.4/Python/getargs.c:0
Python-2.5.4/Python/thread_solaris.h:0
Python-2.5.4/Python/dup2.c:0
Python-2.5.4/Python/getplatform.c:0
Python-2.5.4/Python/frozenmain.c:0
Python-2.5.4/Python/pyfpe.c:0
Python-2.5.4/Python/getmtime.c:0

If you only want the files that contain at least one match, filter out the zero counts:

grep -c -r 'import' Python-2.5.4/ | \
    awk -F : '$2 > 0 {total += $2; print $1, $2} END { print "Total:", total }'

That will output:

[... snipped]
Python-2.5.4/Lib/dis.py 4
Python-2.5.4/Lib/mhlib.py 10
Python-2.5.4/Lib/decimal.py 8
Python-2.5.4/Lib/new.py 6
Python-2.5.4/Lib/stringold.py 3
Total: 11500

You can change how the files ($1) and the count per file ($2) are printed.
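As one example of changing the output format, the count can be printed first so the list sorts numerically. This is a sketch on a made-up demo tree, and it assumes paths without colons.

```shell
# Hypothetical demo tree; names and contents are illustrative only.
demo=$(mktemp -d)
printf 'import os\nimport sys\n' > "$demo/two.py"
printf 'import re\n' > "$demo/one.py"
printf 'pass\n' > "$demo/none.py"

# Print "count<TAB>file" for files with at least one matching line,
# then sort with the highest count first.
ranked=$(grep -c -r 'import' "$demo" |
    awk -F : '$2 > 0 { printf "%d\t%s\n", $2, $1 }' |
    sort -rn)
printf '%s\n' "$ranked"
```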

Nick Presta
2

Some solutions with AWK:

grep -r "string here" * | awk 'END { print NR } 1'

The next one prints the total count, the number of files, and the number of matches per file, displaying only the first match from each file (to display all matches, change the condition to ++f[$1]):

grep -r "string here" * | 
    awk -F: 'END { print "\nmatches: ", NR, "files: ", length(f); 
                   for (i in f) print i, f[i] } !f[$1]++'

Output of the first solution (searching a directory for "boost::"; I manually truncated some overly long lines so they fit horizontally):

list_inserter.hpp:            return range( boost::begin(r), boost::end(r) );
list_of.hpp:            ::boost::is_array<T>,
list_of.hpp:            ::boost::decay<const T>,
list_of.hpp:            ::boost::decay<T> >::type type;
list_of.hpp:        return ::boost::iterator_range_detail::equal( l, r );
list_of.hpp:        return ::boost::iterator_range_detail::less_than( l, r );
list_of.hpp:        return ::boost::iterator_range_detail::less_than( l, r );
list_of.hpp:        return Os << ::boost::make_iterator_range( r.begin(), r.end() );
list_of.hpp:            return range( boost::begin(r), boost::end(r) );
list_of.hpp:            return range( boost::begin(r), boost::end(r) );
list_of.hpp:            return range( boost::begin(r), boost::end(r) );
ptr_list_of.hpp:                          BOOST_DEDUCED_TYPENAME boost::ptr_...
ptr_list_of.hpp:        typedef boost::ptr_vector<T>       impl_type;
13

Output of the second one:

list_inserter.hpp:            return range( boost::begin(r), boost::end(r) );
list_of.hpp:            ::boost::is_array<T>,
ptr_list_of.hpp:                          BOOST_DEDUCED_TYPENAME boost::ptr_...

matches:  13 files:  3
ptr_list_of.hpp 2
list_of.hpp 10
list_inserter.hpp 1
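The `!f[$1]++` condition is what limits the output to the first match per file: the expression is true only the first time a given `$1` is seen, and the array keeps counting afterwards. A minimal sketch on made-up input (the file names and the `seen` array name are illustrative):

```shell
# Three fake "file:match" lines; a.hpp appears twice.
firsts=$(printf 'a.hpp:one\na.hpp:two\nb.hpp:three\n' |
    awk -F: '!seen[$1]++')

# Only the first line for each distinct file name is printed.
printf '%s\n' "$firsts"
```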

Colors in the result are nice (--color=always for grep), but they break when piped through awk here. So it's better not to enable them, unless you want your whole terminal colored afterwards :) Cheers!
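If colors are enabled through an alias or environment setting, one way to keep the pipeline safe is to switch them off explicitly for that command. A sketch, assuming a grep that supports `--color` (GNU and BSD grep do); the demo files are made up.

```shell
# Hypothetical demo tree; names and contents are illustrative only.
demo=$(mktemp -d)
printf 'boost::begin(r)\nboost::end(r)\n' > "$demo/a.hpp"
printf 'boost::decay<T>\n' > "$demo/b.hpp"

# --color=never keeps escape sequences out of the file-name field,
# so splitting on ":" in awk behaves as expected.
files=$(grep -r --color=never "boost::" "$demo" |
    awk -F: '!f[$1]++ { n++ } END { print n + 0 }')
echo "files with matches: $files"
```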

Johannes Schaub - litb
1
grep -rc "my string" ./ | grep ':[1-9][0-9]*$' >> file_name_by_count.txt

Works like a charm.
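If only the file names are wanted in the output file, the trailing count can be stripped after filtering. A sketch on a made-up demo tree; the `sed` step and the temp file are my additions, not part of the answer above.

```shell
# Hypothetical demo tree; names and contents are illustrative only.
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'my string\nmy string\n' > "$demo/hit.txt"
printf 'nothing\n' > "$demo/miss.txt"
printf 'my string\n' > "$demo/sub/hit2.txt"

# Keep lines whose count is non-zero, then drop the ":count" suffix.
list=$(mktemp)
grep -rc "my string" "$demo" |
    grep ':[1-9][0-9]*$' |
    sed 's/:[0-9]*$//' |
    sort > "$list"
cat "$list"
```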

KrNel
1

I would try a combination of find and grep.

find . -type f -print0 | xargs -0 grep -c "string here"

Anyway, grep -c -r "string here" * works for me (Mac OS X).
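The find-based approach can also be written without xargs, using find's `-exec ... {} +` form, which passes file names directly and so copes with spaces. A sketch on made-up files; `-H` (always print the file name) is assumed to be available, as it is in GNU and BSD grep.

```shell
# Hypothetical demo tree, including a file name with spaces.
demo=$(mktemp -d)
printf 'string here\n' > "$demo/plain.txt"
printf 'string here\nstring here\n' > "$demo/name with spaces.txt"

# One "file:count" line per regular file, sorted for stable output.
counts=$(find "$demo" -type f -exec grep -cH "string here" {} + | sort)
printf '%s\n' "$counts"
```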

mouviciel
0

"How can I total count of finds?"

grep -roh "string here" . | grep -v "^Binary.*matches$" | grep -c ^
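The `-o` flag prints every match on its own line, so this pipeline counts occurrences rather than matching lines. A sketch showing the difference on made-up files (the binary-file filter is left out because the demo contains only text):

```shell
# Hypothetical demo tree; the first file has two matches on one line.
demo=$(mktemp -d)
printf 'string here and string here\nno match\n' > "$demo/a.txt"
printf 'string here\n' > "$demo/b.txt"

# Matching lines: one per line that contains the string.
lines=$(grep -r "string here" "$demo" | wc -l | tr -d ' ')

# Occurrences: -o emits each match separately, -h drops file names.
occurrences=$(grep -roh "string here" "$demo" | wc -l | tr -d ' ')

echo "lines: $lines, occurrences: $occurrences"
```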

  • 1
    Welcome to Stack Overflow! While this code may solve the question, [including an explanation](//meta.stackexchange.com/q/114762) of how and why this solves the problem would really help to improve the quality of your post, and probably result in more up-votes. Remember that you are answering the question for readers in the future, not just the person asking now. Please [edit] your answer to add explanations and give an indication of what limitations and assumptions apply. – Yunnosch Jan 23 '23 at 20:49
  • Hi. Don't worry. I'm not staying. So instead of allowing people to do their research - thus learn and improve, you're asking me to do what exactly? Remember that you are living in the same society you're modeling :). <> Dear people waiting for me to explain it to you: read the manual. Say "Aha!" Understanding won't bite you. – the wind Jan 24 '23 at 09:42
0

To output only file names with matches, use:

grep -r -l "your string here" .

It will output one line with the filename for each file which matches the expression searched for.
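To send that list to a file, as the question asks, redirect the output. A sketch on a made-up demo tree; the list file is created outside the searched directory so grep does not search it.

```shell
# Hypothetical demo tree; names and contents are illustrative only.
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'your string here\n' > "$demo/a.txt"
printf 'unrelated\n' > "$demo/b.txt"
printf 'x\nyour string here\n' > "$demo/sub/c.txt"

# -l prints each matching file name once, however many matches it has.
list=$(mktemp)
grep -r -l "your string here" "$demo" > "$list"

matched=$(wc -l < "$list" | tr -d ' ')
echo "$matched files written to $list"
```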

ASk