We have a git repo containing both source code and binaries. The bare repo has now reached ~9GB, and cloning it takes ages. Most of that time is spent in "remote: Compressing objects". After a commit with a new version of one of the bigger binaries, a fetch also takes a long time, again spent compressing objects on the server.
After reading "git pull without remotely compressing objects", I suspect delta compression of binary files is what hurts us as well, but I'm not 100% sure how to go about fixing it.
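A quick way to check whether the big binaries really dominate is to list the largest objects in the pack (run inside the bare repo; the pack file name under objects/pack will differ):

    # list pack entries sorted by object size (3rd column), biggest first
    git verify-pack -v objects/pack/pack-*.idx | sort -k3 -n -r | head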
What are the exact steps to fix the bare repo on the server? My guess (pulled together as a sketch after the list):
- Add entries like '*.zip -delta' for all extensions I want to exclude to info/attributes (in a bare repo there is no .git directory; the file sits at the repo root)
- Run 'git repack', but with what options? Would -adF repack everything, and leave me with a repo where no delta compression has ever been done on the specified file types?
- Run 'git prune'. I thought this happened automatically, but running it while playing around with a bare clone of said repo decreased the size by ~2GB
- Clone the repo, then add and commit a .gitattributes file with the same entries as in info/attributes on the bare repo
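Pulled together, the plan for the server would look roughly like this (a sketch only: /srv/git/project.git is a made-up path, and whether -a -d -F are the right repack flags is exactly what I'm asking):

    cd /srv/git/project.git                   # hypothetical path to the bare repo
    echo '*.zip -delta' >> info/attributes    # repeat for each binary extension
    git repack -a -d -F                       # recompute all packs from scratch
    git prune                                 # drop unreferenced loose objects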
Am I on to something?
Update:
Some interesting test results on this. Today I started a bare clone of the problematic repo. Our not-so-powerful server with 4GB of RAM ran out of memory and started swapping. After 3 hours I gave up...
Then I instead cloned a bare repo from my up-to-date working copy. Cloning that one between workstations took ~5 minutes. I then pushed it up to the server as a new repo. Cloning that repo took only 7 minutes.
If I interpret this correctly, a well-packed repo performs much better, even without disabling delta compression for binary files. I guess this means the steps above are indeed what I want to do in the short term, but in addition I need to find out how to limit the amount of memory git is allowed to use for packing/compression on the server, so I can avoid the swapping.
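These are the pack-related settings I found that should bound memory use during packing; the values are guesses that would need tuning for a 4GB machine:

    git config pack.windowMemory 100m    # cap memory for the delta search window
    git config pack.threads 1            # single-threaded packing lowers peak memory
    git config pack.deltaCacheSize 50m   # shrink the cache of computed deltas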
In case it matters: The server runs git 1.7.0.4 and the workstations run 1.7.9.5.
Update 2:
I did the following steps on my test repo, and think I will chance doing them on the server (after a backup):
Limit memory usage when packing objects:

    git config pack.windowMemory 100m
    git config pack.packSizeLimit 200m

Disable delta compression for some extensions:

    echo '*.tar.gz -delta' >> info/attributes
    echo '*.tar.bz2 -delta' >> info/attributes
    echo '*.bin -delta' >> info/attributes
    echo '*.png -delta' >> info/attributes

Repack the repository and collect garbage:

    git repack -a -d -F --window-memory 100m --max-pack-size 200m
    git gc
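To verify the effect, comparing the repo size before and after is a quick check (not part of the fix itself):

    git count-objects -v    # 'size-pack' is the total pack size in KiB
    du -sh objects/pack     # on-disk size of the packs in the bare repo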
Update 3:
Some unexpected side effects after this operation: see "Issues after trying to repack a git repo for improved performance".