2

I found a lot of questions about how many lines were added and removed in a given commit, and many answers too. The best one, in my opinion, is: https://gist.github.com/KOGI/8700277

However, no one seems to have asked about the total number of lines in the file(s) at a given point in history, that is, at a given commit. Is this possible, or am I looking at it the wrong way?

Ideally I would expect:

  • number of lines added to file X
  • number of lines removed from file X
  • total number of lines in file X

I need this information to get some metrics and conduct statistical analysis on the way a product changes over time.

Thanks!!

Burhan Ali
Sokratis

2 Answers

3

For statistical analysis, you can try the steps below.

You can use git log and some shell-fu:

git log --shortstat --author "Aviv Ben-Yosef" --since "2 weeks ago" --until "1 week ago" \
    | grep "files\? changed" \
    | awk '{files+=$1; inserted+=$4; deleted+=$6}
           END {print "files changed", files, "lines inserted:", inserted, "lines deleted:", deleted}'

Explanation: git log --shortstat displays a short statistic for each commit, which includes the number of changed files and of inserted and deleted lines. We can then filter it for a specific author (--author "Your Name") and a time range (--since "2 weeks ago" --until "1 week ago").

Now, to sum up the stats instead of seeing one entry per commit, we add a bit of shell scripting. First, we use grep to keep only the summary lines. These lines look like this:

 8 files changed, 169 insertions(+), 81 deletions(-)

or this:

 1 file changed, 4 insertions(+), 4 deletions(-)

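We then sum these using awk: for each line we add the number of files changed (1st word), inserted lines (4th word), and deleted lines (6th word), and print the totals at the end. Note that git omits the insertions or deletions part when its count is zero, so for commits that only add or only remove lines these fixed positions will miscount.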
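As a rough sketch of a sturdier variant, the summing step could match each number by the word that follows it instead of by its position. The function name sum_shortstat is made up for this illustration; it only assumes the summary-line format shown above.

```shell
# Sum the summary lines printed by `git log --shortstat`, read from stdin.
# Numbers are matched by the word that follows them, not by position,
# because git leaves out the insertions or deletions field when it is zero.
# (sum_shortstat is a hypothetical helper name, not a git command.)
sum_shortstat() {
    awk '
        /^ [0-9]+ files? changed/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^file/)      files    += $(i - 1)
                if ($i ~ /^insertion/) inserted += $(i - 1)
                if ($i ~ /^deletion/)  deleted  += $(i - 1)
            }
        }
        END {
            printf "files changed: %d, lines inserted: %d, lines deleted: %d\n", files, inserted, deleted
        }
    '
}
```

It would be used in place of the grep and awk stages, for example: git log --shortstat --author "Your Name" | sum_shortstat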

The output of the following command should be reasonably easy to pipe into a script that adds up the totals:

git log --author="<authorname>" --oneline --shortstat

This gives stats for all commits reachable from the current HEAD. If you want to add up stats from other branches, you have to supply them as arguments to git log.

For passing the output to a script, you can drop even the one-line summary by using an empty log format. As Jakub Narębski commented, --numstat is another alternative: it generates per-file rather than per-line statistics but is even easier to parse.

git log --author="<authorname>" --pretty=tformat: --numstat
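To illustrate why --numstat is easy to parse, here is a minimal sketch that adds up the added and deleted counts per file. The function name sum_numstat is hypothetical; the sketch assumes the usual numstat layout of added count, deleted count, and path separated by tabs, with "-" in both count columns for binary files.

```shell
# Sum `git log --numstat` output per file, read from stdin.
# Each numstat line is "<added><TAB><deleted><TAB><path>".
# Blank lines and binary files (shown as "-") are skipped.
# (sum_numstat is a hypothetical helper name, not a git command.)
sum_numstat() {
    awk -F '\t' '
        NF == 3 && $1 != "-" {
            added[$3]   += $1
            deleted[$3] += $2
        }
        END {
            for (path in added)
                printf "%s\t%d\t%d\n", path, added[path], deleted[path]
        }
    ' | sort
}
```

For example: git log --author="<authorname>" --pretty=tformat: --numstat | sum_numstat. Each output line is a path followed by its total added and deleted lines. Renamed files appear under whatever path text git prints for them, so they are not merged.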

There is an alternative too:

You can generate stats using Gitstats. It has an 'Authors' section which includes the number of lines added/removed by the top 20 authors (top 20 by commit count).

Edit: There's also Git: Blame Statistics

Community
SantanuMajumdar
  • This still doesn't answer the question of getting the total number of lines of each file, or does it? – Sokratis Apr 23 '15 at 16:37
  • From what I understand this provides: number of commits, lines added and lines removed. If I am mistaken please let me know where exactly does it specify the total number of lines of a single file? – Sokratis Apr 23 '15 at 17:29
  • Just updated the answer. You could use --numstat "It generates per-file rather than per-line statistics but is even easier to parse". Let me know if this helps. – SantanuMajumdar Apr 23 '15 at 17:39
  • Probably it's just me not stating the question properly. Your last command `git log --author="" --pretty=tformat: --numstat` works like a charm, but it still doesn't provide the total number of lines of the file (I'm interested in getting the lines of code of the entire file, not just added/removed). ;) – Sokratis Apr 23 '15 at 18:29
0

If you are on Linux or have Cygwin installed, you could check out the file as of the commit you are interested in (git checkout <commit> -- myfile) and then run wc -l myfile.
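If checking out each commit is not practical, one possible sketch reads the file contents directly with git show <commit>:<path> and counts lines from there, leaving the working tree untouched. The function names lines_at_commit and file_history are made up for this illustration. The sketch assumes it is run from the repository root and that the file was not renamed.

```shell
# Print the number of lines in <path> as it existed at <commit>,
# without modifying the working tree.
# Usage: lines_at_commit <commit> <path>
# (Both function names here are hypothetical helpers, not git commands.)
lines_at_commit() {
    git show "$1:$2" | wc -l | tr -d ' '
}

# For every commit that touched <path>, newest first, print:
#   <commit> <added> <removed> <total lines after the commit>
# Usage: file_history <path>
file_history() {
    git log --pretty=tformat:%H --numstat -- "$1" |
    awk -F '\t' 'NF == 1 { commit = $1 } NF == 3 { print commit, $1, $2 }' |
    while read -r commit added removed; do
        echo "$commit $added $removed $(lines_at_commit "$commit" "$1")"
    done
}
```

For example, file_history myfile prints one line per commit. Since wc -l counts newline characters, a file whose last line has no trailing newline is reported one line short.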

FlyingFoX
  • The idea is to get all the information at once and not to have to checkout for every single commit. Thanks though!! – Sokratis Apr 23 '15 at 16:33