command to count occurrences of word in entire file

Question

I am trying to count the occurrences of a word in a file.

If word occurs multiple times in a line, I will count is a 1.

Following command will give me the output but will fail if line has multiple occurrences of word

grep -c "word" filename.txt

Is there any one liner?

Possible duplicate of [Calculate Word occurrences from file in bash](http://stackoverflow.com/questions/11850823/calculate-word-occurrences-from-file-in-bash) – jbh, Feb 06 '14 at 12:55
Does "I will count is a 1." mean "I will count it as 1" or "I will count each as 1"? – Guntram Blohm, Feb 06 '14 at 12:58

score 20 · Answer 1 · answered Feb 06 '14 at 12:55

You can use grep -o to print each match on its own line, and then count those lines with wc -l:

grep -o "word" filename.txt | wc -l

Test

$ cat a
hello hello how are you
hello i am fine
but
this is another hello

$ grep -c "hello" a    # Normal `grep -c` fails
3

$ grep -o "hello" a 
hello
hello
hello
hello
$ grep -o "hello" a | wc -l   # grep -o solves it!
4
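
Note that grep -o "word" also counts matches inside longer words, such as the hello in othello. If only whole words should count, adding -w is one option. A minimal sketch, assuming a grep that supports both -o and -w (GNU and BSD grep do); `count_word` is a hypothetical helper name:

```shell
# count_word WORD FILE: count whole-word occurrences of WORD in FILE.
# (count_word is a hypothetical helper name used for illustration.)
count_word() {
    # -o prints each match on its own line; -w keeps whole-word matches only.
    # "--" ends option parsing in case WORD starts with a dash.
    # tr strips the padding that some wc implementations add to the count.
    grep -ow -- "$1" "$2" | wc -l | tr -d ' '
}
```

For the sample file above, `count_word hello a` prints 4.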

score 3 · Answer 2 · answered Feb 07 '14 at 04:00

Set RS in awk for a shorter one.

awk 'END{print NR-1}' RS="word" file
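
A multi-character RS is treated as a regular expression by gawk and mawk, but POSIX leaves that case unspecified, and NR-1 may be off by one if the file ends exactly with the word. A more portable sketch counts the matches line by line with gsub(), which returns the number of substitutions it made. `count_matches` is a hypothetical helper name, and the pattern is interpreted as an extended regular expression:

```shell
# count_matches PATTERN FILE: count non-overlapping matches of PATTERN in FILE.
# (count_matches is a hypothetical helper name used for illustration.)
count_matches() {
    # gsub() returns how many replacements it made on the current line;
    # replacing each match with itself ("&") leaves the line unchanged.
    # n + 0 forces a numeric result, so a file with no matches prints 0.
    awk -v pat="$1" '{ n += gsub(pat, "&") } END { print n + 0 }' "$2"
}
```

Like the RS version, this counts substrings, so othello contributes one match for hello.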

BMW

score 2 · Answer 3 · answered Feb 06 '14 at 13:53

GNU awk can do this in a single command, without the use of multiple piped commands:

awk -v w="word" '$1==w{n++} END{print n+0}' RS=' |\n' file
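
The same exact-match count can be sketched without a regular-expression RS, by looping over the fields of each line instead. This assumes words are separated by whitespace, and `count_fields` is a hypothetical helper name:

```shell
# count_fields WORD FILE: count fields that are exactly equal to WORD.
# (count_fields is a hypothetical helper name used for illustration.)
count_fields() {
    # With the default field separator, awk splits each line on runs of
    # blanks; every field is compared against the word passed in via -v.
    # n + 0 prints 0 rather than an empty line when nothing matched.
    awk -v w="$1" '{ for (i = 1; i <= NF; i++) if ($i == w) n++ }
                   END { print n + 0 }' "$2"
}
```

Because the comparison is exact, a field with attached punctuation such as hello, is not counted.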

anubhava

score 1 · Answer 4 · answered Feb 06 '14 at 12:57

tr -s ' ' '\n' < file | grep -c word

This assumes that words in the file are separated by spaces. If punctuation joins two occurrences of the word, or nothing at all separates them on a line, they will be counted as one.
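
To relax the spaces-only assumption, one option is to split on every non-alphanumeric character and then count the lines that consist of exactly the word. A minimal sketch, where `count_word_tr` is a hypothetical helper name:

```shell
# count_word_tr WORD FILE: count WORD as a whole token in FILE.
# (count_word_tr is a hypothetical helper name used for illustration.)
count_word_tr() {
    # tr -c complements the set and -s squeezes repeats, so every run of
    # non-alphanumeric characters becomes a single newline (one token per line).
    # grep -x matches whole lines only, and -c prints how many lines matched.
    tr -cs '[:alnum:]' '\n' < "$2" | grep -cx -- "$1"
}
```

With this split, a line such as hello,hello hello. yields three separate tokens, so it counts as 3.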

atk

how about `tr " " "\n"< file |grep -c "word"` – BMW Feb 07 '14 at 03:54
I think `grep -o '[^ \t\n,.]\+'` would let you specify word separators, then use `wc -l` – coya Apr 22 '16 at 16:08
Sorry, missed the -P option in the regexp. See: http://stackoverflow.com/questions/1825552/grep-a-tab-in-unix for more info – coya Apr 22 '16 at 16:27

score -1 · Answer 5 · answered Feb 06 '14 at 12:54

grep word filename.txt | wc -l

grep prints the lines that match, and wc -l then prints the number of those lines.

Michael

It does not count reoccurrences of words in the same line. This counts how many lines have the word in them – jbh Feb 06 '14 at 12:57
@GuntramBlohm No it does not. Given my sample file, it would return 3 instead of 4. – fedorqui Feb 06 '14 at 12:57
"I will count is a 1." would mean, to me, he wants multiple words on the same line count only once. – Guntram Blohm Feb 06 '14 at 12:59
However, read the "Following command will give me the output but will fail if line has multiple occurrences of word." I think he probably meant to say "If a word occurs multiple times in a line, it will count it as 1" – jbh Feb 06 '14 at 13:01
Yes, he meant that "up to now, if there are multiple occurrences on one line, it counts them as one", and therefore he is looking for a better solution (one that counts occurrences of the word, not lines containing the word). Hence the question; otherwise, his "grep -c" would already be the answer. – Olivier Dulac Feb 06 '14 at 13:19
