138

How can I show files in Git which change most often?

Zombo
Sebastian

10 Answers

204

You could do something like the following:

git log --pretty=format: --name-only | sort | uniq -c | sort -rg | head -10

Here git log outputs only the names of the files changed in each commit; the rest of the pipeline sorts those names, counts the duplicates, and prints the 10 most frequently appearing filenames.
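
The counting half of that pipeline can be wrapped in a small shell function. This is only a minimal sketch: `top_changed_files` is a hypothetical name, not a git command. It drops the empty lines that `--pretty=format:` leaves between commits, and it uses `sort -rn` instead of `sort -rg` on the assumption that only a POSIX sort may be available.

```shell
# top_changed_files: hypothetical helper, not part of git.
# Reads one path per line on stdin, ignores blank lines, and prints
# the N most frequent paths with their counts (N defaults to 10).
top_changed_files() {
  grep -v '^$' | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Usage inside a repository:
#   git log --pretty=format: --name-only | top_changed_files 10
```

Because the function only reads stdin, the same helper works with any of the git log variations (for example with `--branches` or `--since`).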

Zombo
Mark Longair
  • Can you please tell me if this is based off the current branch or if it is for the whole repository? What about branches not yet merged? – Karthick S Mar 05 '13 at 15:29
  • @KarthickS: that's only for commits in the current branch - you could add `--branches` to the `git log` if you want to include commits on any of your local branches. – Mark Longair Mar 06 '13 at 12:55
  • Nice. I also found that it reports files that were deleted a long time ago. A quick fix was to limit the time range, e.g. `--since="last year"` – FractalSpace Apr 05 '13 at 21:04
  • I'm seeing a blank entry as the first result, making the command look like: `git log --pretty=format: --name-only | grep -v '^$' ...` fixed it for me – mjgpy3 Mar 25 '15 at 14:42
  • To expand on this, here is a command to show the most active directories across branches: "git log --branches --pretty=format: --name-only | tr '\n' '\0' | xargs -0 -n1 dirname | sort | uniq -c | sort -rg | head -10" – Zack Morris Oct 22 '15 at 01:40
  • also helpful is using `--since "1 month ago"` or other options to narrow down the time window –  Oct 25 '18 at 17:19
  • on windows the sort pipe did not work properly with that flag. I changed it to this and it worked properly: `git log --pretty=format: --name-only | sort | uniq -c | sort /R | head -100` – Stian Standahl Jun 05 '19 at 06:35
  • Is there a way to modify this for certain files? Let's say I want to know the X most modified `.java` files. Also, when I put `-10` at the end I only get 8 items. If I change the number, it appears to always be off by two. Any idea why? – AdamMc331 Jun 17 '19 at 16:21
  • Found part of my answer: `git log --pretty=format: --since="1 year ago" --name-only -- "*.java" | sort | uniq -c | sort -rg | head -10` – AdamMc331 Jun 17 '19 at 16:33
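
The per-directory count suggested in the comments above can also be sketched as a stdin filter. `top_changed_dirs` is a hypothetical helper name; it calls `dirname` once per path in a loop, which is simple but slow on large histories.

```shell
# top_changed_dirs: hypothetical helper, not part of git.
# Reads one path per line on stdin, maps each path to its directory,
# and prints the N most frequently changed directories (N defaults to 10).
top_changed_dirs() {
  grep -v '^$' |
    while IFS= read -r path; do
      dirname -- "$path"
    done |
    sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Usage inside a repository:
#   git log --branches --pretty=format: --name-only | top_changed_dirs 10
```

Files in the repository root are counted under `.`, since that is what `dirname` returns for a path without a slash.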
57

You can use the git effort command (from the git-extras package), which shows statistics on the number of commits per file (by commits and active days).

EDIT: git effort is just a bash script, which you can find here and adapt if you need something more specialized.

Asenar
  • The output will be 2-parted, first you get the unsorted results, then the sorted (and coloured) results. Right? – Andy Dec 09 '16 at 15:52
  • @Andy it seems so (and `git help effort` has no information about it :/). I assume the first set of results is ordered by filename, and the second by number of commits per file. The man page also mentions https://github.com/tj/git-extras/issues to report issues – Asenar Dec 20 '16 at 08:37
  • is this similar to this? https://blog.riff.org/2015_10_30_git_tip_of_the_day_show_the_hottest_files_in_a_repo –  Oct 25 '18 at 17:13
18

I noticed that both Mark’s and sehe’s answers do not --follow the files, that is to say they stop once they reach a file rename. This script will be much slower, but will work for that purpose.

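git ls-files |
while IFS= read -r file
do
  # Progress dot on stderr, one per file
  printf . >&2
  # Count the commits that touched this file, following renames
  count=$(git log --follow --oneline -- "$file" | wc -l | tr -d ' ')
  printf '%s\t%s\n' "$count" "$file"
done > counts.tmp
echo
# Most frequently changed files first
sort -nr counts.tmp
rm counts.tmp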
Zombo
  • To expand on this I created https://gist.github.com/caleb15/da591031936f35d80e14a42ca7ba4350 It aggregates changes by folder, specifically by each folder in the `roles` directory for my case but is easily modified to fit your use case. – Almenon Nov 22 '19 at 19:26
  • If one is primarily interested in -recent- hotspots, they can add the `--since` argument to `git log`. I used `git log --follow --since=2022 --oneli...` for example. – AlexMA Aug 03 '22 at 16:45
10

An old question, but still a very useful one. Here is a working example in plain PowerShell. It lists the 10 most-changed files in your repo on the branch you are currently on.

git log --pretty=format: --name-only | Where-Object { ![string]::IsNullOrEmpty($_) } | Sort-Object | Group-Object  | Sort-Object -Property Count -Descending | Select-Object -Property Count, Name -First 10
Omar Rodriguez
5

A simple Node tool with more flexible filters is git-heatmap. Run git-heatmap in your project folder; it will iterate over the last 1000 commits and generate a heat map of the most changed files. Run git-heatmap -h to see the other filters.

Hainan Zhao
4

This is a Windows version:

git log --pretty=format: --name-only  > allfiles.csv

Then open the file in Excel and add these columns (header in row 1, formula in row 2, filled down):

A: FileName
B: isVisibleFilename >> =IFERROR(IF(C2>0,TRUE,FALSE),FALSE)
C: DotLocation >> =FIND("@",SUBSTITUTE(A2,".","@",(LEN(A2)-LEN(SUBSTITUTE(A2,".","")))/LEN(".")))
D: HasExt >> =C2>1
E: TYPE >> =IF(D2=TRUE,MID(A2,C2+1,18),"")

Create a pivot table:

Values: Type
  Filter: isVisibleFilename = TRUE
  Rows: Type
  Sub: FileName

Click [Count Of TYPE] -> Sort -> Sort Largest To Smallest
Mickey Perlstein
3

For PowerShell, assuming you have Git Bash installed:

git log --pretty=format: --name-only | sort | uniq -c | sort -Descending | select -First 10
hyeomans
2
git whatchanged --all | \grep '^:' | cut -d' ' -f5- | cut -f2- | sort | uniq -c | sort

If you only want to see your own files, add --author: git whatchanged --author=name --all.
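
The same count can be sketched with git log, which the other answers here already use. Since git log by default neither prints raw diff lines nor skips merges, `--raw --no-merges` is added to approximate what whatchanged shows. `raw_paths` is a hypothetical helper name, and the sketch assumes raw lines of the form `:mode mode hash hash status<TAB>path` with no rename detection enabled.

```shell
# raw_paths: hypothetical helper, not part of git.
# Reads `git log --raw` style output on stdin and prints the path from
# each raw diff line (the lines that start with ':').
raw_paths() {
  grep '^:' | cut -f2-
}

# Usage inside a repository (--author is optional):
#   git log --raw --no-merges --all --author=name | raw_paths | sort | uniq -c | sort -rn | head -n 10
```

Selecting lines by the leading colon avoids depending on how the abbreviated hashes are printed.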

Andrew
0

We can also find the files changed between two commits or branches, e.g.:

git log  --pretty=format: --name-only <source_branch>...<target_branch> | sort | uniq -c | sort -rg | head -50 
Pawan Maheshwari
0

This is probably obvious, but the queries provided show all files, and you may not care that your configuration or project files are the most frequently updated. A simple grep narrows the results to your code files, for example:

git log --pretty=format: --name-only | grep '\.cs$' | sort | uniq -c | sort -rg | head -20
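
One caveat when filtering by extension with grep: in a regular expression an unescaped `.` matches any character, so a pattern such as `.cs$` also matches a name like `docs`. A minimal sketch of an extension filter follows; `only_ext` is a hypothetical helper name, and it assumes the extension contains no regex metacharacters.

```shell
# only_ext: hypothetical helper, not part of git.
# Reads paths on stdin and keeps only those ending in ".<extension>".
# The dot is escaped so it matches a literal '.', and '$' anchors the
# match at the end of the line.
only_ext() {
  grep "\\.$1\$"
}

# Usage inside a repository:
#   git log --pretty=format: --name-only | only_ext cs | sort | uniq -c | sort -rn | head -n 20
```

An alternative that avoids grep altogether is to pass a pathspec such as `-- "*.cs"` to git log, as one of the comments on the first answer does for `.java` files.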