
What would be a way to find largest commits (i.e. commits introducing most changes, for instance counted as the number of added/removed lines) in a git repo?

Note that I really want largest commits, not largest files, so git find fat commit is not helpful here.

akoprowski

2 Answers


You can use `git log --format=format:"%H" --shortstat`. It will output something like:

b90c0895b90eb3a6d1528465f3b5d96a575dbda2
 2 files changed, 32 insertions(+), 7 deletions(-)

642b5e1910e1c2134c278b97752dd73b601e8ddb
 11 files changed, 835 insertions(+), 504 deletions(-)

// other commits skipped

Each commit is a hash line followed by a one-line summary, so the output is easy to parse.
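As a rough illustration of how that parsing might look, here is a minimal Python sketch that pulls the three numbers out of one summary line. The names `SUMMARY` and `parse_summary` are made up for this example. It assumes the English summary format shown above, and it treats the insertions and deletions parts as optional because git leaves a part out when its count is zero.

```python
import re

# Matches one --shortstat summary line. The insertions and deletions
# parts are optional because git omits a part whose count is zero.
SUMMARY = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)


def parse_summary(line):
    """Return (files, insertions, deletions) for a summary line, else None."""
    match = SUMMARY.search(line)
    if match is None:
        return None  # e.g. a hash line or a blank line
    files, insertions, deletions = match.groups()
    return int(files), int(insertions or 0), int(deletions or 0)
```

Calling `parse_summary` on every line of the log and skipping the `None` results gives one tuple per commit that has a summary line.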

max
  • I get an error when running the git log --format=format:"%H" --shortstat command. fatal: ambiguous argument '%H': unknown revision or path not in the working tree. – DucRP Jan 09 '17 at 15:02
  • How do I sort by number of insertions/deletions? – Boris Verkhovskiy Jan 13 '20 at 19:22
  • @Boris and anyone else who wants to parse this. A quick and dirty way for me was to use some regex to find anything with over 9,999 insertions and 9,999 deletions. While in the git log view, type `/` to begin searching (just like vim). Then type this regex `\d{5,} insertions\(\+\) \d{5,}`. This means search for 5 or more digits, then the string `insertions(+) `, followed by another 5 or more digits. Change the `5`s up or down to find more or fewer matches. If you find no matches, just start with the smallest regex `\d` to make sure it's working, then start adding the curly braces, etc. – Jason Jun 23 '20 at 12:34
  • @Jason Thanks for that regex. I had to add a comma in there since the git output text must have changed a little. – tim-phillips May 19 '22 at 20:35

For anyone wanting a simple list of commits ordered from largest to smallest (by the number of lines changed in each commit), I took @max's answer, then parsed and sorted the result.

git log --format=format:"%H" --shortstat | perl -00 -ne 'my ($hash, $filesChanged, $insertions, $deletions) = $_ =~ /(?:[0-9a-f]+\n)*([0-9a-f]+)\n(?: (\d+) files? changed,)?(?: (\d+) insertions?...,?)?(?: (\d+) deletions?...)?/sg; print $hash, "\t", $insertions + $deletions, "\n"' | sort -k 2 -nr

That takes all the commits, adds together the number of insertions and deletions for each, and then orders that list from highest to lowest. To get just the ten largest commits, add `| head -10` to the end.
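If Perl is not your thing, the same idea can be sketched in Python. This is only an illustration, not a drop-in replacement: `rank_commits` and `git_log_shortstat` are names invented here, the hash pattern assumes full 40 or 64 character hex hashes as printed by `%H`, and commits with no summary line (merges, for example) are counted as 0 changes.

```python
import re
import subprocess

HASH = re.compile(r"^[0-9a-f]{40,64}$")          # a full commit hash on its own line
COUNT = re.compile(r"(\d+) (?:insertion|deletion)")  # either count in a summary line


def rank_commits(log_text):
    """Return [(hash, insertions + deletions)], largest total first."""
    totals = {}
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        if HASH.match(line):
            current = line
            totals[current] = 0  # stays 0 if no summary line follows
        elif current is not None and "changed" in line:
            totals[current] += sum(int(n) for n in COUNT.findall(line))
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def git_log_shortstat(repo="."):
    """Run the git log command from the answer above and return its output."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--format=format:%H", "--shortstat"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Inside a repository, `rank_commits(git_log_shortstat())[:10]` would give the ten largest commits, matching the `head -10` variant.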

Callum Gare