393

I tried looking for a good tutorial on reducing repo size, but found none. How do I reduce my repo size? It's about 10 MB, but Heroku only allows 50 MB and I'm nowhere near finished developing my app.

I already added the usual suspects (log, vendor, doc, etc.) to .gitignore, although I only added .gitignore recently.

Any suggestions?

Venkat
sent-hil
  • 1
    I just did and it brought it down to 2.2 MB, thanks a lot! Although that didn't seem to reduce the repo size on Heroku... hmm – sent-hil Jan 22 '10 at 11:16
  • 11
    Push it using --force. It will overwrite the contents even if there was no change (no new commits, etc.) – Marcin Gil Jan 22 '10 at 11:21
  • 1
    @MarcinGil - Below, VonC states you need access to the server to clean the remote server (if I am parsing it correctly). – jww Jun 16 '16 at 11:34
  • 2
    Just a comment to help other readers if they don't know what to add to the `.gitignore`, there is a nice service at gitignore.io that will help you set up a good `.gitignore` based on your dev environment. – Blairg23 Jan 15 '17 at 22:27

4 Answers

465

Update Feb. 2021, eleven years later: the new git maintenance command (man page) should supersede git gc, and can be scheduled.
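As a rough sketch of what a one-off run through git maintenance might look like: the function name `run_gc_task` is hypothetical, and the fallback to plain git gc for Git versions that predate `git maintenance` (older than 2.29) is an assumption added for illustration, not part of the original answer.

```shell
# Run the "gc" maintenance task once in the given repository (default: .).
# Prints "maintenance" if `git maintenance` handled it, or "gc" if the
# function had to fall back to plain `git gc` (assumed fallback for old Git).
run_gc_task() (
    cd "${1:-.}" || exit 1
    if git maintenance run --task=gc --quiet >/dev/null 2>&1; then
        echo "maintenance"
    else
        git gc --quiet >/dev/null 2>&1 && echo "gc"
    fi
)
```

Calling `run_gc_task path/to/repo` performs a single run. Scheduling is a separate step: `git maintenance start` registers the current repository for background runs, so it is left out of the function on purpose.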


Original: git gc --aggressive is one way to force the prune process to take place (to be sure: git gc --aggressive --prune=now). You have other commands to clean the repo too. Don't forget though, sometimes git gc alone can increase the size of the repo!

It can also be used after a filter-branch, to mark some directories to be removed from the history (with a further gain of space); see here. But that rewrites history, so it assumes nobody is pulling from your public repo. filter-branch can keep backup refs in .git/refs/original, so that directory can be cleaned too.

Finally, as mentioned in this comment and this question, cleaning the reflog can help:

git reflog expire --all --expire=now
git gc --prune=now --aggressive
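To check whether the two commands above actually helped, the size of the object store can be compared before and after. This is only a sketch: the function names `repo_size_kib` and `shrink_repo` are made up for illustration, and the size is taken from the `size` and `size-pack` fields of `git count-objects -v`, which are reported in KiB.

```shell
# Print the size of the object store (loose objects + packfiles) in KiB.
# Argument: path to a repository (default: current directory).
repo_size_kib() (
    cd "${1:-.}" || exit 1
    git count-objects -v |
        awk '$1 == "size:" || $1 == "size-pack:" { total += $2 } END { print total + 0 }'
)

# Expire the reflog, garbage-collect, and report the size before and after.
# Note: this discards reflog entries, so "lost" commits become unrecoverable.
shrink_repo() (
    cd "${1:-.}" || exit 1
    before=$(repo_size_kib .)
    git reflog expire --all --expire=now
    git gc --prune=now --aggressive --quiet
    after=$(repo_size_kib .)
    echo "before: ${before} KiB, after: ${after} KiB"
)
```

Running `shrink_repo path/to/repo` prints one line with both numbers. As noted above, the "after" value is not guaranteed to be smaller.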

An even more complete, and possibly dangerous, solution is to remove unused objects from a git repository.


Note that git filter-repo (recommended by the Git documentation since Git 2.24, Q4 2019) now replaces the obsolete git filter-branch or BFG: it is a separate Python-based tool that must be installed first.

Joe suggests:

# Find the largest files in .git:
git rev-list --objects --all | grep -f <(git verify-pack -v  .git/objects/pack/*.idx| sort -k 3 -n | cut -f 1 -d " " | tail -10)

# Start filtering these large files:
git filter-repo --path-glob '../../src/../..' --invert-paths --force
#or
git filter-repo --path-glob '*.zip' --invert-paths --force
#or
git filter-repo --path-glob '*.a' --invert-paths --force

git remote add origin git@github.com:.../...git
git push --all --force
git push --tags --force
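As an alternative to the verify-pack pipeline above, the largest blobs in the history can also be listed with `git cat-file --batch-check`, which works in plain sh and reports uncompressed sizes. This is a sketch, not Joe's method; the function name `largest_blobs` and its arguments are hypothetical.

```shell
# List the largest blobs reachable from any ref, biggest first.
# Arguments: repository path (default: .), number of entries (default: 10).
# Output lines have the form "<size in bytes> <path>".
largest_blobs() (
    cd "${1:-.}" || exit 1
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob"' |
        cut -d' ' -f2- |
        sort -rn |
        head -n "${2:-10}"
)
```

For example, `largest_blobs . 5` prints the five biggest blobs; the paths it reports are candidates for the `--path` or `--path-glob` options of git filter-repo.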
VonC
  • In another scenario, see also http://stackoverflow.com/questions/1029969/why-is-my-git-repository-so-big – VonC Jan 22 '10 at 11:40
  • 2
    Note to self: don't forget remote branches: http://stackoverflow.com/questions/11255802/delete-remove-binary-file-from-git-repository-is-still-large – VonC Jun 29 '12 at 06:26
  • 2
    Note to self: don't forget remote tags – saiyancoder Oct 06 '14 at 06:27
  • 1
    In addition to remote references, the reflog is another thing that may cause references you are trying to remove to be kept. http://stackoverflow.com/q/27489761/1072626 – vossad01 Dec 15 '14 at 18:03
  • @vossad01 good point. I have included your comment in the answer for more visibility. – VonC Dec 15 '14 at 18:14
  • @VonC I started `git gc --aggressive`, but it was interrupted midway because the disk ran out of space, and now my git repository has grown in size. I cannot add more space. How can I fix this? – Vitaly Zdanevich May 10 '16 at 09:25
  • 1
    @VitalyZdanevich Not sure: ask a new question with the exact version of git and the OS used, to see if I or other git-questions contributors can suggest a fix. – VonC May 10 '16 at 09:27
  • Do we do a `git push` after this to push the changes to the server? I'm guessing not since it results in *`Everything up-to-date`*. Is there a way to clean the server directly? – jww Jun 16 '16 at 10:53
  • 4
    @jww I confirm this is purely a local operation. It has no bearing on the size of the remote repo. You would need direct access to the server hosting that remote repo to do the same there. – VonC Jun 16 '16 at 10:56
  • I reversed the order above and clearing the reflog didn't help much when done after the garbage collection and pruning, which reduced my repo from 10 to 5 MB. Perhaps it should be done in the order specified. ;-) – Tom Russell Jun 22 '17 at 20:08
  • 1
    I ran the original gc bits on my repo. My `.git` folder dropped from 1.7GB to 235MB. Great tip @VonC – Chase Florell Dec 30 '21 at 19:52
120

Thanks for your replies. Here's what I did:

git gc
git gc --aggressive
git prune

That seems to have done the trick. I started with around 10.5 MB and now it's a little more than 980 KB.

Boris Yakubchik
sent-hil
  • 11
    `prune` is always run by `gc` (with 2 weeks ago default). – Cas Oct 10 '12 at 12:21
  • 147
    You can run all three at once, pruning everything up to now, using `git gc --aggressive --prune=now` – rahul286 Oct 19 '12 at 18:44
  • 5
    But, when I delete the repo then clone it again, the size is still large. How do you handle that? – cwtuan Jan 04 '19 at 15:43
  • If you delete your local repository and clone again, you inherit the remote's .git folder. To keep the size reduction, you likely have to at least push the changes yourself first. If you don't control the remote you're out of luck, but you could always make your own fork – rjm27trekkie Jul 31 '20 at 22:23
  • 1
    After running the three commands, the local repo became smaller, but git status shows no changes at all, so there is nothing to commit and push to the remote repo... How do I shrink the remote repo? – Bruce Yang Sep 28 '21 at 02:55
31

In my case, I pushed several big (> 100 MB) files and then removed them. But they were still in the history of my repo, so I had to remove them from it as well.

What did the trick was:

bfg -b 100M  # Remove all blobs larger than 100 MB from history
git reflog expire --expire=now --all
git gc --prune=now --aggressive

Then, you need to force-push your branch:

git push origin <your_branch_name> --force

Note: bfg is a tool that can be installed on Linux and macOS using brew:

brew install bfg
vvvvv
  • 1
    that is such a clean solution. 'git gc' and 'git prune' didn't help me from other answers. – Asim Apr 01 '22 at 16:08
1

This does not affect everyone, but one semi-hidden reason for a large repository can be Git submodules.

You might have added one or more submodules, then stopped using them at some point, while some of their files remained in the .git/modules directory. To get rid of the redundant submodule files, see this question.

Alternatively, just as with the main repository, you can navigate to the submodule's directory under .git/modules and run, for example, git gc --aggressive --prune.

These steps should noticeably reduce the repository size. However, as long as you still use Git submodules, especially ones containing large libraries, do not expect the size to change drastically.

MAChitgarha