
This is the size of my .git folder:

$ du -sh .git
321M    .git

A couple of huge files were accidentally committed there in the past.

Through some bizarre magic I found the culprits and their blobs:

$ git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |
    sort --numeric-sort --key=2 |
    cut -c 1-12,41- |
    $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest


....
5c48fadaee41  492MiB android/java_pid3568.hprof
...

And the author of these commits:

git log --all --full-history -- "android/java_pid3568.hprof"

commit cac160ab39b9c70f3d05f8194be8c3c0657161ad
Author: Incore <xxxx@mail.ru>
Date:   Thu Dec 10 18:34:34 2020 +0300

commit fc0cfc2f32ddd0f927169ad0514213f79795dd63
Author: Incore <xxxx@mail.ru>
Date:   Tue Dec 8 21:26:17 2020 +0300


After this, I tried to rewrite my history to remove this file:

git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch android/java_pid3568.hprof' \
  --prune-empty --tag-name-filter cat -- --all



Rewrite fc0cfc2f32ddd0f927169ad0514213f79795dd63 (19/376) (1 seconds passed, remaining 18 predicted)    rm 'android/java_pid3568.hprof'
Rewrite 540ced9097b7090bfd68279515178bf060db93ee (19/376) (1 seconds passed, remaining 18 predicted)    rm 'android/java_pid3568.hprof'

Now my git log shows there is no such file in history:

git log --all --full-history -- "android/java_pid3568.hprof"

... empty response

However, the size of my .git folder has not been reduced:

$ du -sh .git
321M    .git

Why is that?

kurtgn

1 Answer


The commits containing the huge file are still referenced by the reflog (and by the backup refs that git filter-branch keeps under refs/original/), so a plain git gc cannot garbage-collect them.
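
If you want to confirm this before cleaning up, something like the following may help. It is only a sketch: the helper name who_holds is hypothetical, and it looks at refs only, so a commit that is kept alive solely by a reflog entry will not be listed.

```shell
# Hypothetical helper (not a git command): list the refs that still reach
# commit $1 and report whether blob $2 is still stored in repo $3 (default ".").
who_holds() {
  commit=$1
  blob=$2
  repo=${3:-.}
  # Refs containing the commit, including filter-branch backups under refs/original/.
  git -C "$repo" for-each-ref --format='%(refname)' --contains "$commit"
  # cat-file -e exits with status 0 only if the object exists.
  if git -C "$repo" cat-file -e "$blob" 2>/dev/null; then
    echo "blob $blob is still stored"
  else
    echo "blob $blob is gone"
  fi
}
```

With the IDs from the question this would be called as: who_holds cac160ab39b9c70f3d05f8194be8c3c0657161ad 5c48fadaee41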

You can find some useful commands in the answers to this question: How to remove unreferenced blobs from my git repo

For example:

git reflog expire --expire-unreachable=now --all
git gc --prune=now

or

git -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 -c gc.rerereunresolved=0 -c gc.pruneExpire=now gc
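
Since git filter-branch also leaves backup refs under refs/original/, those may need to go before the prune has any effect. A minimal sketch that combines the steps, assuming you no longer need the pre-rewrite history (the function name purge_rewritten_history is hypothetical):

```shell
# Hypothetical helper: drop everything that still pins the pre-rewrite
# commits in repo $1 (default "."), then prune unreachable objects.
purge_rewritten_history() {
  repo=${1:-.}
  # 1. Delete the backup refs that git filter-branch stores under refs/original/.
  git -C "$repo" for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
      git -C "$repo" update-ref -d "$ref"
    done
  # 2. Expire every reflog entry immediately.
  git -C "$repo" reflog expire --expire=now --all
  # 3. Repack and delete objects that are no longer reachable.
  git -C "$repo" gc --quiet --prune=now
}
```

Run it as purge_rewritten_history /path/to/repo and check du -sh .git again afterwards. This is destructive: once the objects are pruned, the old commits cannot be recovered from this clone.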

PS: Oh, and it's always a good idea to use https://rtyley.github.io/bfg-repo-cleaner/ for this kind of cleaning. It is much easier, safer and quicker than filter-branch.

Philippe