Count all occurrences of a string in lots of files with grep

Question

I have a bunch of log files. I need to find out how many times a string occurs in all files.

grep -c string *

returns

...
file1:1
file2:0
file3:0
...

Using a pipe I was able to get only files that have one or more occurrences:

grep -c string * | grep -v :0

...
file4:5
file5:1
file6:2
...

How can I get only the combined count? (If it returns file4:5, file5:1, file6:2, I want to get back 8.)

Can you tell me what grep -v :0 does? I know it keeps only the files with more than 0 occurrences, but what do the -v option and :0 mean? – Gautham Honnavara, May 09 '17 at 17:57
@GauthamHonnavara grep :0 prints the lines that contain the string :0. -v inverts the search, so grep -v :0 prints all lines that don't contain :0. Lines such as file4:5 and file27:193 pass through because they don't contain :0. – penguin359, May 16 '17 at 17:56
You can search multiple files by separating their names with spaces: `grep string file1 file2 --options` – Dnyaneshwar Harer, Sep 27 '19 at 09:11

score 330 · Answer 1 · answered Jul 14 '10 at 19:31

This works for multiple occurrences per line:

grep -o string * | wc -l
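
As a rough illustration of why -o matters, here is a minimal sketch on made-up sample files (the scratch directory and the names a.log and b.log are assumptions, not part of the question). It compares the line count from grep -c with the occurrence count from grep -o piped to wc -l:

```shell
# Hypothetical sample data in a scratch directory (file names are made up).
dir=$(mktemp -d)
printf 'string string\nno match here\n' > "$dir/a.log"
printf 'one string\n' > "$dir/b.log"

# -c counts matching lines: two lines contain "string".
lines=$(cat "$dir"/*.log | grep -c string)

# -o prints every match on its own line, so wc -l counts occurrences: three.
matches=$(grep -o string "$dir"/*.log | wc -l)

echo "matching lines: $lines, occurrences: $matches"
rm -r "$dir"
```

The two numbers only differ when a line contains the string more than once.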

answered Jul 14 '10 at 19:31

Jeremy Lavine

This also works: `grep -o string * --exclude-dir=some/dir/one/ --exclude-dir=some/dir/two | wc -l`. – a coder Nov 05 '14 at 14:16
`grep -ioR string * | wc -l` is what I use to do a case-insensitive, recursive, matching-only search – LeonardChallis May 28 '15 at 08:55
This one shows the relevant files and then the total count of matches: `grep -rc test . | awk -F: '$NF > 0 {x+=$NF; $NF=""; print} END{print "Total:",x}'` – Yaron Sep 06 '17 at 11:40
Beware of limitations to grep: https://superuser.com/questions/1703029/is-there-a-limit-for-a-line-length-for-grep-command-to-process-correctly – duplex143 Aug 24 '23 at 16:50

score 303 · Accepted Answer · answered Dec 16 '08 at 12:17

cat * | grep -c string

answered Dec 16 '08 at 12:17

Bombe

This has the same limitation that it counts multiple occurrences on one line only once. I am guessing that this behavior is OK in this case, though. – Michael Haren Dec 16 '08 at 12:22
@Michael Haren Yes, there could be only one occurrence of string in a line. – Željko Filipin Dec 16 '08 at 12:25
I'd rather do `grep -c string<*` So just replacing the space with a less than. – JamesM-SiteGen Jan 04 '12 at 02:08
Does not address multiple occurrences on a line – bluesman May 09 '12 at 16:14
This doesn't work if you want to search in subdirectories too, whereas `grep -o` and `wc -l` does. cat is quicker in cases like the original question though. – Leagsaidh Gordon Jan 03 '13 at 15:37
Somewhat tangential, but it's what I came here hoping for & might help others: this won't work for `git grep` because it doesn't have `-o`, but `git grep | grep -c ` does. Like the accepted answer, inaccurate for the case where there's multiple occurrences on one line. `git grep | grep -o | wc -l` will cover that case. – eggsyntax Apr 13 '17 at 20:19
Note that the OP in the question also does not count multiple occurrences per line – information_interchange Jul 12 '18 at 16:24

score 28 · Answer 3 · answered Feb 27 '13 at 07:40

grep -oh string * | wc -w

will count multiple occurrences in a line
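
One caveat worth sketching: wc -w counts words, not matches, so the result is only right when each match is a single word. The example below uses a made-up file and the hypothetical pattern 'foo bar', which contains a space, to show the difference from wc -l:

```shell
# Hypothetical sample file; the pattern 'foo bar' is chosen because it contains a space.
dir=$(mktemp -d)
printf 'foo bar and foo bar again\n' > "$dir/a.txt"

# Each match "foo bar" is two words, so wc -w reports 4 for 2 matches.
words=$(grep -oh 'foo bar' "$dir"/*.txt | wc -w)

# -o prints one match per line, so wc -l reports the 2 matches.
lines=$(grep -oh 'foo bar' "$dir"/*.txt | wc -l)

echo "wc -w: $words, wc -l: $lines"
rm -r "$dir"
```

For a pattern without whitespace, such as string, both variants give the same number.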

answered Feb 27 '13 at 07:40

Kaofu

`grep -oh "... my that curry was strong" * >> wc` :) – icc97 Mar 23 '16 at 16:03
@icc97 did you mean to pipe to wc or to cwc? (curse word count) – Matiaan May 29 '23 at 07:38

score 26 · Answer 4 · answered Dec 16 '08 at 12:15

Instead of using -c, just pipe it to wc -l.

grep string * | wc -l

This will list each occurrence on a single line and then count the number of lines.

This will miss instances where the string occurs 2+ times on one line, though.

answered Dec 16 '08 at 12:15

Michael Haren

Piping to "wc -l" works also nicely together with "grep -r 'test' ." which scans recursively all files for the string 'test' in all directories below the current one. – Stephan Kristyn Dec 13 '11 at 15:07

score 18 · Answer 5 · answered Dec 16 '08 at 12:18

cat * | grep -c string

One of the rare useful applications of cat.

answered Dec 16 '08 at 12:18

Joachim Sauer

score 13 · Answer 6 · edited May 01 '18 at 13:07

You can add -R to search recursively (and avoid using cat) and -I to ignore binary files.

grep -RIc string .

edited May 01 '18 at 13:07

answered Dec 12 '13 at 12:18

azmeuk

Andriy Makukha · Answer 7 · 2018-10-07T10:07:52.300

If you want number of occurrences per file (example for string "tcp"):

grep -RIci "tcp" . | awk -v FS=":" -v OFS="\t" '$2>0 { print $2, $1 }' | sort -hr

Example output:

53  ./HTTPClient/src/HTTPClient.cpp
21  ./WiFi/src/WiFiSTA.cpp
19  ./WiFi/src/ETH.cpp
13  ./WiFi/src/WiFiAP.cpp
4   ./WiFi/src/WiFiClient.cpp
4   ./HTTPClient/src/HTTPClient.h
3   ./WiFi/src/WiFiGeneric.cpp
2   ./WiFi/examples/WiFiClientBasic/WiFiClientBasic.ino
2   ./WiFiClientSecure/src/ssl_client.cpp
1   ./WiFi/src/WiFiServer.cpp

Explanation:

grep -RIci NEEDLE . - searches recursively from the current directory (following symlinks) for the string NEEDLE, ignoring binary files and case, and prints the number of matching lines per file
awk ... - drops files with a zero count and prints each remaining line as count, tab, file name
sort -hr - sorts the lines by the number in the first column, largest first

Of course, it works with other grep commands with option -c (count) as well. For example:

grep -c "tcp" *.txt | awk -v FS=":" -v OFS="\t" '$2>0 { print $2, $1 }' | sort -hr

Awesome! Worked like a charm. Saved days of time. Thank you so much. – sreejagaths, Jul 20 '20 at 18:46

score 12 · Answer 8 · edited Jun 18 '17 at 08:26

Something different than all the previous answers:

perl -lne '$count++ for m/<pattern>/g;END{print $count}' *

edited Jun 18 '17 at 08:26

Peter Mortensen

answered Feb 27 '13 at 08:00

Vijay

nice to see an approach not using grep, esp as my grep (on windows) doesn't support the -o option. – David Roussel Mar 12 '13 at 15:14

score 12 · Answer 9 · edited Jun 18 '17 at 08:25

Obligatory AWK solution:

grep -c string * | awk 'BEGIN{FS=":"}{x+=$2}END{print x}'

Take care if your file names include ":" though.
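
To make the colon caveat concrete, here is a small sketch with a made-up file name that contains ":" (app:web.log and plain.txt are assumptions for illustration). Summing $NF, the last field, instead of $2 is one possible way around it:

```shell
# Hypothetical files; 'app:web.log' is a made-up name containing ":".
dir=$(mktemp -d)
printf 'string\nstring\n' > "$dir/app:web.log"
printf 'string\n' > "$dir/plain.txt"

# grep -c prints "app:web.log:2" and "plain.txt:1".
# With FS=":", $2 of the first line is "web.log", which awk treats as 0.
second=$(cd "$dir" && grep -c string * | awk -F: '{x+=$2} END{print x}')

# $NF is always the last field, which is the count.
last=$(cd "$dir" && grep -c string * | awk -F: '{x+=$NF} END{print x}')

echo "sum of \$2: $second, sum of \$NF: $last"
rm -r "$dir"
```

Here the $2 version silently undercounts, while the $NF version returns the sum of all per-file counts.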

edited Jun 18 '17 at 08:25

Peter Mortensen

answered Sep 29 '11 at 12:26

mumrah

score 7 · Answer 10 · edited Jun 18 '17 at 08:26

The AWK solution which also handles file names including colons:

grep -c string * | sed -r 's/^.*://' | awk '{x+=$1}END{print x}'

Keep in mind that this method still does not find multiple occurrences of string on the same line.

edited Jun 18 '17 at 08:26

Peter Mortensen

answered Jan 25 '13 at 20:07

Kreuvf

score 5 · Answer 11 · edited Jun 18 '17 at 08:28

You can use a simple grep to count matches. I use the -i option so that STRING, StrING, and string are all matched. Note that -c counts matching lines, so adding -o does not make it count several matches on one line separately.

Command that prints the file names:

grep -oci string * | grep -v :0

Command that omits the file names and prints 0 for each file without matches:

grep -ochi string *
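
The interaction between -c and -o can be checked with a short sketch, assuming GNU grep and a made-up one-line file (a.log is a hypothetical name). With -c, grep reports matching lines whether or not -o is present:

```shell
# Hypothetical one-line file with three case variants of the string.
dir=$(mktemp -d)
printf 'string STRING String\n' > "$dir/a.log"

# -c reports matching lines, even when -o is also given: 1.
with_c=$(grep -oci string "$dir/a.log")

# Dropping -c and counting the lines that -o prints gives occurrences: 3.
with_wc=$(grep -oi string "$dir/a.log" | wc -l)

echo "grep -oci: $with_c, grep -oi | wc -l: $with_wc"
rm -r "$dir"
```

So the commands above give per-file line counts. To count every occurrence, pipe the -o output to wc -l instead.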

edited Jun 18 '17 at 08:28

Peter Mortensen

answered Jun 12 '15 at 13:19

Mitul Patel

Could you please elaborate more your answer adding a little more description about the solution you provide? – abarisone Jun 12 '15 at 13:27

score 5 · Answer 12 · answered Jul 17 '17 at 16:25

short recursive variant:

find . -type f -exec cat {} + | grep -c 'string'
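
A minimal sketch of the recursive variant on a made-up directory tree (the directory layout and file names are assumptions). It also shows an occurrence-counting version that swaps grep -c for grep -o piped to wc -l:

```shell
# Hypothetical nested directory tree with two sample files.
dir=$(mktemp -d)
mkdir -p "$dir/sub/deeper"
printf 'string\n' > "$dir/top.log"
printf 'string string\n' > "$dir/sub/deeper/nested.log"

# Concatenate every regular file under $dir and count matching lines: 2.
lines=$(find "$dir" -type f -exec cat {} + | grep -c 'string')

# Same traversal, but count every occurrence instead: 3.
matches=$(find "$dir" -type f -exec cat {} + | grep -o 'string' | wc -l)

echo "matching lines: $lines, occurrences: $matches"
rm -r "$dir"
```

Because cat merges everything into one stream, grep prints a single combined number rather than one count per file.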

answered Jul 17 '17 at 16:25

Dmitry Tarashkevich

Thank you! Only your solution worked for me (summed the matches of all the files). – Nestor Aug 25 '19 at 22:01

score 2 · Answer 13 · edited Jun 18 '17 at 08:27

Here is a faster-than-grep AWK alternative way of doing this, which handles multiple matches of <url> per line, within a collection of XML files in a directory:

awk '/<url>/{m=gsub("<url>","");total+=m}END{print total}' some_directory/*.xml

This works well in cases where some XML files don't have line breaks.
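
The following sketch runs the same awk program on two made-up XML files (one.xml and two.xml are hypothetical names), one of which has no line breaks at all. It relies on gsub returning the number of replacements it made, and compares the result with a line-based grep -c:

```shell
# Hypothetical sample files; one.xml deliberately contains no newline.
dir=$(mktemp -d)
printf '<urlset><url>a</url><url>b</url><url>c</url></urlset>' > "$dir/one.xml"
printf '<urlset>\n<url>d</url>\n</urlset>\n' > "$dir/two.xml"

# gsub returns how many times it replaced "<url>", so total sums every match: 4.
total=$(awk '/<url>/{m=gsub("<url>","");total+=m}END{print total}' "$dir"/*.xml)

# grep -c only sees one matching line per file here: 1 + 1 = 2.
lines=$(grep -ch '<url>' "$dir"/*.xml | awk '{x+=$1} END{print x}')

echo "awk gsub total: $total, grep -c total: $lines"
rm -r "$dir"
```

The gap between the two numbers comes entirely from one.xml, where all three tags sit on a single line.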

edited Jun 18 '17 at 08:27

Peter Mortensen

answered Jun 11 '14 at 19:02

Excalibur

Quantic · Answer 14 · 2015-12-15T19:48:38.333

Grep-only solution, which I tested with grep for Windows:

grep -ro "pattern to find in files" "Directory to recursively search" | grep -c "pattern to find in files"

This solution counts all occurrences, even if there are several on one line. -r searches the directory recursively, and -o will "show only the part of a line matching PATTERN", which splits up multiple occurrences on a single line and makes grep print each match on its own line. Those newline-separated results are then piped back into grep with -c, which counts them using the same pattern.

score 0 · Answer 15 · edited Oct 29 '14 at 17:21

Another one-liner using basic command-line tools that handles multiple occurrences per line.

cat * | sed 's/string/\nstring /g' | grep string | wc -l

edited Oct 29 '14 at 17:21

gloomy.penguin

answered Jan 23 '14 at 16:26

NTwoO

score 0 · Answer 16 · edited Sep 14 '21 at 01:11

awk -v RS='' -v FPAT='fast' '{print NF,FILENAME}' <file1..N>

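With RS='' awk reads each blank-line-separated block as one record, and FPAT='fast' (a gawk extension) makes every match of fast a field. The command then prints the number of fields (NF) for each record together with the file name.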

edited Sep 14 '21 at 01:11

ChrisGPT was on strike

answered Sep 13 '21 at 20:47

Alan Tegel
