31

I want to count the number of matches on a single line (or across all lines, since there will always be only one line).

I want to count every match, not just one per line as in:

echo "123 123 123" | grep -c -E "123" # Result: 1

Better example:

echo "1 1 2 2 2 5" | grep -c -E '([^ ])( \1){1}' # Result: 1, expected: 2 or 3
Tyilo

5 Answers

53

You could use grep -o then pipe through wc -l:

$ echo "123 123 123" | grep -o 123 | wc -l
3
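
A minimal sketch of the same idea wrapped in a function, assuming a grep that supports `-o` (GNU and BSD grep do; POSIX does not require it). The name `count_matches` is hypothetical, and `tr -d ' '` is only there because some `wc` implementations pad their output with spaces:

```shell
# count_matches is a hypothetical helper name, not part of grep.
# Reads stdin and prints the number of non-overlapping matches of the ERE in $1.
count_matches() {
  # grep exits with status 1 when nothing matches; "|| true" keeps that
  # from aborting scripts that run with "set -e -o pipefail".
  { grep -o -E -e "$1" || true; } | wc -l | tr -d ' '
}

echo "123 123 123" | count_matches '123'     # prints 3
echo "no match here" | count_matches '123'   # prints 0
```

Because `grep -o` prints one match per output line, the count covers every match on every input line, not just the number of matching lines that `grep -c` reports.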
Simon Whitaker
  • 1
    My version of grep doesn't know what `-o` is :( – manojlds May 30 '11 at 22:52
  • 15
    You need to ask Father Christmas for a new grep this year. :) – Simon Whitaker May 30 '11 at 22:54
  • @manojlds, do you have `egrep`? Same thing would work w/ `egrep` – Mike Pennington May 30 '11 at 22:54
  • @Mike Pennington - thanks, `egrep` says the same. I am on Windows now, so I think it's expected. – manojlds May 30 '11 at 22:57
  • @Tyilo - that's not surprising, look at your regex. It's asking for 0 or more instances of anything other than a space, followed by a space, followed by the first thing you matched again. Note: **0 or more**. There are indeed five such matches in your string (assuming you don't rewind after each match). They are: 1) `"1 1"` (bytes 1-3), 2) `" "` (i.e. zero instances of something that isn't a space, followed by a space, followed by the same zero instances again - byte 4), 3) `"2 2"` (bytes 5-7), 4) `" "` (byte 8) and finally 5) `" "` (byte 10). Phew! – Simon Whitaker May 30 '11 at 23:00
  • (Run it without the pipe to `wc -l` at the end and you'll see them.) – Simon Whitaker May 30 '11 at 23:03
  • What still results in 2? Your grep -E example, or my answer to your question? (I get 5 from the former, 3 from the latter.) – Simon Whitaker May 30 '11 at 23:06
  • This: `echo "1 1 2 2 2 5" | grep -o -E '([^ ])( \1){1}' | wc -l` – Tyilo May 30 '11 at 23:08
  • As far as I know, `grep -o` won't rewind on finding a match, so this will only match "1 1" (bytes 1-3) and "2 2" (bytes 5-7). It won't match bytes 7-9 ("2 2") because by the time it comes to consider bytes 8 onwards it's already consumed bytes 1-7 in the previous two matches. – Simon Whitaker May 30 '11 at 23:11
  • Why does `echo "1 1 2 2 2 5" | grep -o 2 | wc -l`, which gives 3, not meet your requirement? – grok12 May 31 '11 at 01:15
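
The comments above point out that `grep -o` does not rewind after a match, so overlapping matches are counted only once. If overlapping matches should be counted, one possible sketch is an awk loop that restarts the search one character after the start of each match. `count_overlapping` is a hypothetical name; the sketch assumes the pattern never matches the empty string, contains no `^` anchor, and uses no backreferences (awk regular expressions do not support them).

```shell
# count_overlapping is a hypothetical helper name.
# Reads stdin and prints the number of matches of the regex in $1,
# counting matches that overlap each other.
count_overlapping() {
  awk -v re="$1" '
    {
      s = $0
      # match() sets RSTART to the position of the leftmost match (0 if none).
      while (match(s, re)) {
        n++
        # Resume one character after the match start, not after its end,
        # so a later match may overlap this one.
        s = substr(s, RSTART + 1)
      }
    }
    END { print n + 0 }   # "+ 0" prints 0 instead of an empty line
  '
}

echo "1 1 2 2 2 5" | count_overlapping '2 2'   # prints 2
```

For comparison, `echo "1 1 2 2 2 5" | grep -o '2 2' | wc -l` gives 1, because the second `2 2` shares a character with the first.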
1

Maybe something like this:

echo "123 123 123" | sed "s/123 /123\n/g" | wc -l

(Maybe ugly, but my bash-fu is not that great.)

manojlds
1

Maybe you should convert spaces to newlines first:

$ echo "1 1 2 2 2 5" | tr ' ' $'\n' | grep -c 2
3
glenn jackman
0

Why not use awk? You could use awk '{print gsub(your_regex,"&")}' to print the number of matches on each line, or awk '{c+=gsub(your_regex,"&")}END{print c}' to print the total number of matches. Note that relative speed may vary depending on which awk implementation is used, and which input is given.
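
As a concrete sketch of the two awk forms, with hypothetical helper names (`matches_per_line`, `matches_total`). The regex is passed as an awk string through `-v`, which is an assumption of this sketch; awk processes backslash escapes in such a string before using it as a regex:

```shell
# Hypothetical helper names. gsub() replaces each match with itself ("&")
# and returns how many replacements it made, i.e. the match count.
matches_per_line() {
  awk -v re="$1" '{ print gsub(re, "&") }'
}

matches_total() {
  # "c + 0" prints 0 rather than an empty line when nothing matched.
  awk -v re="$1" '{ c += gsub(re, "&") } END { print c + 0 }'
}

echo "123 123 123" | matches_per_line '123'   # prints 3
printf '1 2 2\n2 5\n' | matches_total '2'     # prints 3
```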

jarno
  • Another way, using gawk, is `gawk -v FPAT=your_regex '{print NF}'` or `gawk -v FPAT=your_regex '{c+=NF}END{print c}'`, respectively. – jarno Sep 03 '15 at 17:57
0

This might work for you:

sed -n -e ':a' -e 's/123//p' -e 'ta' file | sed -n '$='

With GNU sed, this could be written:

sed -n ':;s/123//p;t' file | sed -n '$='
potong
  • The first script doesn't work with GNU sed 4.2.2: "sed: can't find label for jump to `a'". It seems to work better if you replace `:ta` with `:a`. The scripts seem to require a newline at the end of the input. Besides, the script outputs nothing if no matches are found. Test: `printf 123 | sed -n ':;s/123//p;t' | sed -n '$='` outputs nothing. – jarno Sep 04 '15 at 18:09
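
One way to work around the empty output when nothing matches is to default the result to 0. This is only a sketch: `count_with_sed` is a hypothetical name, the pattern is pasted directly into the sed script (so it must not contain `/`), and the input is still assumed to end with a newline, as noted in the comment above.

```shell
# count_with_sed is a hypothetical helper name.
# Reads stdin and prints how many times the basic regex in $1 can be
# removed from the input, one removal at a time.
count_with_sed() {
  # First sed: delete one match, print the line, loop while a deletion succeeded.
  # Second sed: '$=' prints the number of lines received (nothing if there were none).
  n=$(sed -n -e ':a' -e "s/$1//p" -e 'ta' | sed -n '$=')
  # Fall back to 0 when the second sed printed nothing.
  echo "${n:-0}"
}

echo "123 123 123" | count_with_sed '123'   # prints 3
echo "abc" | count_with_sed '123'           # prints 0
```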