
I have a project with ~12MB worth of code and assets in it. I've been tracking it using Git, and just noticed that my .git folder is now just over 1.83GB. It consists of a few small files, and then just one pack file that makes up about 1.82GB of the folder.

I've run `git gc --aggressive` and `git gc --prune`, but the size didn't change. I've also tried:

git reflog expire --expire=now --all
git repack -ad  # Remove dangling objects from packfiles
git prune       # Remove dangling loose objects

But it's still the same size. I've even cloned it (once locally with a forced repack, and once again from Git), but it's still 1.83GB on each. Is that normal? Is there any way to reduce the size of it, or do I just start a new repo, copy the code over, and accept that my past commits will be gone?

Machavity
Bryce
  • By "assets", do you mean non-compressible stuff like images etc? Have you been editing those a lot? [This](http://naleid.com/blog/2012/01/17/finding-and-purging-big-files-from-git-history/) seems to be what you're after, but disclaimer: I have never done it, so clone your repo before messing with it :p – Amadan Mar 25 '13 at 01:17
  • General VCS comment: binary files (images, ZIP archives, ...) are usually stored as-is for each version (unlike text, which can be packed very effectively by storing compressed diffs) and can't be compressed further, since most binary formats nowadays are already compressed. So it could very well be that you have many versions of binary files that simply take up that much space (see if you can collect some sort of per-file-type stats on number of versions and file sizes). – Alexei Levenkov Mar 25 '13 at 01:17
  • Ok, the binary file bit seems to be a likely culprit. Is there any way to remove those files from past commits, in addition to adding them to my .gitignore file? – Bryce Mar 25 '13 at 07:59
  • The GitHub tutorial on [removing sensitive data](https://help.github.com/articles/remove-sensitive-data) can be applied to other content you wish to remove as well. In particular, you'll need to modify the `filter-branch` command to remove the binary files you don't wish to track. Do note that this is a destructive process. – cjc343 Mar 25 '13 at 16:24
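
Following up on the suggestion above to collect per-file size stats: here is a minimal sketch that lists the largest blobs anywhere in history, including files that have since been deleted from the working tree. The function name `largest_blobs` is made up for illustration; it relies only on the standard `git rev-list` and `git cat-file` plumbing commands and assumes it is run from inside the repository.

```shell
# Hypothetical helper: print the largest blobs reachable from any ref
# as "<size-in-bytes> <path>", largest first.
# Optional first argument: how many entries to show (default 10).
largest_blobs() {
  # rev-list emits "<sha> <path>" for every object in history;
  # cat-file replaces each line with "<type> <size> <path>".
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    sed -n 's/^blob //p' |   # keep blobs only, drop the type column
    sort -rn -k1,1 |         # sort numerically by size, descending
    head -n "${1:-10}"
}
```

Running `largest_blobs 20` in the repo should show whether a handful of binary assets account for most of the pack file. The sizes are uncompressed object sizes, so they indicate relative weight rather than exact on-disk usage.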

1 Answer


The comments were a great start for understanding the likely root cause of the problem. I don't really understand the `git filter-branch` command, though, so I was a little wary of just using it.

I came across this tool: https://rtyley.github.io/bfg-repo-cleaner/

It worked wonders. My repo is now under 10MB.
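
The post does not say which BFG options were used, so the following is only a sketch of the workflow described in the BFG documentation, not the exact commands from this answer. The function name `bfg_shrink`, the argument order, and the `1M` default threshold are assumptions made for illustration; `--strip-blobs-bigger-than` is a documented BFG option.

```shell
# Hypothetical helper: strip large blobs from a mirror clone with BFG.
# Assumptions: bfg.jar has been downloaded and Java is installed; the
# 1M default threshold is illustrative, not a value from the post.
bfg_shrink() {
  jar=$1          # path to the downloaded bfg.jar
  mirror=$2       # bare repo created with: git clone --mirror <url> <dir>
  limit=${3:-1M}  # blobs larger than this are removed from history

  [ -f "$jar" ] || { echo "bfg jar not found: $jar" >&2; return 1; }
  [ -d "$mirror" ] || { echo "mirror clone not found: $mirror" >&2; return 1; }

  # Rewrite history, dropping blobs above the limit. By default BFG
  # leaves the contents of the latest commit untouched.
  java -jar "$jar" --strip-blobs-bigger-than "$limit" "$mirror" || return 1

  # BFG only rewrites commits and refs; the unwanted objects are
  # physically deleted by expiring the reflog and garbage-collecting.
  (cd "$mirror" && git reflog expire --expire=now --all &&
    git gc --prune=now --aggressive)
}
```

A call might look like `bfg_shrink ./bfg.jar my-repo.git 1M`, followed by a `git push` from inside the mirror. Because this rewrites history, keep a backup clone first, and expect anyone else with a copy of the repo to need a fresh clone afterwards.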

Christophe Marois
Bryce