I can find all the lines containing the "my string" pattern in a single branch of my git repository with the following command:

git grep "my string" my_branch

Say it produces the following output:

my_branch: file1:file1 What is "my string"?
my_branch: file2:file2 Hello, "my string" is just "my string"!

There are 3 occurrences on two lines in two files. I can count the matching lines with

git grep "my string" my_branch | wc -l

It will result in

2

The question is: how do I get the exact number of occurrences of the string across all lines of all files in a given branch? Is there a command or script that will give me 3 in my example, not 2?

RavinderSingh13
dhilt

3 Answers

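The -o option of git grep was introduced in Git 2.18. In earlier versions, you can combine git show with a standard grep call. Note that git show my_branch prints the tip commit's message and diff rather than the contents of every file on the branch, so the count covers only that output: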

$ git show my_branch | grep -o 'my string' | wc -l
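
For Git older than 2.18, another possible route is to keep git grep for selecting the matching lines and let a plain grep -o split them into individual matches, so that the search still runs over the files on the branch. A minimal sketch, assuming a POSIX shell and a grep that supports -o; the helper name count_occurrences is hypothetical:

```shell
# Hypothetical helper: count every occurrence of a literal string
# in the files of a branch. $1 is the string, $2 is the branch.
#   git grep -h  drops the "branch:file:" prefix, so names are not searched
#   -F           treats the pattern as a fixed string, not a regex
#   grep -o      prints each match on its own line for wc -l to count
count_occurrences() {
  git grep -h -F -e "$1" "$2" | grep -o -F -e "$1" | wc -l
}

# Usage, from inside the repository:
#   count_occurrences "my string" my_branch
```

Suppressing the file names with -h matters here: without it, a file whose name contains the search string would add to the count.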
Paolo
  • It's odd, but this gives me an incorrect result, far lower than expected. I updated git locally and was able to run `git grep -o`, which gave me the correct result. The `git show | awk` approach gives the same incorrect result as the `git show | grep` one. What could be wrong on my side? – dhilt Jan 18 '20 at 19:13

The reason grep piped to wc -l does not give the correct result is that grep matches line by line: a line containing more than one occurrence of the string is still counted as one.

Example of grep not counting multiple occurrences of string on same line:

Let's say we have the following Input_file:

cat Input_file
test my_string
la bla bla
my_string
bla bla

When we run the grep command, it gives the following:

grep "my_string" Input_file | wc -l
2

Now let's put multiple occurrences of the string on a single line. The count stays at 2 even though the string now occurs 3 times:

cat Input_file
test my_string
la bla bla
my_string my_string
bla bla

grep "my_string" Input_file | wc -l
2


So, if it is permitted, you could try awk, which avoids running two programs (grep + wc). The git command is taken from @UnbearableLightness's answer here.

git show my_branch | awk '{sum+=gsub(/my string/,"&")} END{print sum+0}'
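
To see the awk counting on its own, here is a small sketch that recreates the sample Input_file from above (the second version, with two matches on one line) and counts the occurrences. It assumes a POSIX awk; nothing here depends on git:

```shell
# Recreate the sample file used in the example above.
printf '%s\n' 'test my_string' 'la bla bla' 'my_string my_string' 'bla bla' > Input_file

# gsub returns the number of substitutions made on the current line.
# "&" replaces each match with itself, so the text is left unchanged.
# "sum + 0" makes awk print 0 instead of an empty line when nothing matches.
awk '{sum += gsub(/my_string/, "&")} END {print sum + 0}' Input_file   # prints 3
```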
RavinderSingh13

You can use grep's -o option for this and pipe the output to wc -l to get the count:

From man grep:

-o, --only-matching
              Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.

This should work for you:

git grep -o "my string" my_branch | wc -l

Please note that the -o option of git grep requires Git version 2.18 or higher.
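
The effect of -o can be checked without a repository, using plain grep on the two lines from the question. This is only an illustrative sketch: the file name sample.txt is made up, and it assumes a grep that supports -o (GNU and BSD grep both do):

```shell
# Two sample lines taken from the question's example output.
printf '%s\n' 'What is "my string"?' 'Hello, "my string" is just "my string"!' > sample.txt

# Without -o, wc -l counts matching lines.
grep "my string" sample.txt | wc -l      # 2

# With -o, every match is printed on its own line, so wc -l counts matches.
grep -o "my string" sample.txt | wc -l   # 3
```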

dhilt
User123
  • Thanks for the answer, but `git grep -o ...` fails with the following error: `error: unknown switch `o'` – dhilt Jan 18 '20 at 18:21
  • @dhilt: Most probably the git version you are using is a bit old? – User123 Jan 18 '20 at 18:37
  • Yes, I had 2.16. After updating git (to 2.25 in my case) I was able to run `git grep -o ...` and finally got the right result – dhilt Jan 18 '20 at 19:00
  • @dhilt: Glad that it works for you now! You can accept and upvote any of the suitable answers here, cheers :-) – User123 Jan 18 '20 at 19:12
  • I updated your answer to add the Git 2.18 version requirement and accepted it, as this is the only way I got the correct result – dhilt Jan 18 '20 at 21:37