I have a commit with the ID 56f06019, for example. In that commit I accidentally committed a large file (50 MB). In another commit I added the same file again, but at the right (small) size. Now my repo is too heavy to clone. How do I remove that large file from the repo history to reduce the size of my repo?

- In my case, it's not a large file but a configuration file containing database credentials. I was studying Git, and at that time I was unaware of .gitignore. – Rashi Nov 15 '13 at 09:14
- Possible duplicate of [How to remove/delete a large file from commit history in Git repository?](http://stackoverflow.com/questions/2100907/how-to-remove-delete-a-large-file-from-commit-history-in-git-repository) – Apr 04 '14 at 00:34
- Related: https://help.github.com/articles/removing-sensitive-data-from-a-repository/ – Trevor Boyd Smith Jun 21 '18 at 18:59
5 Answers
Chapter 9 of the Pro Git book has a section on Removing Objects.
Let me outline the steps briefly here:
git filter-branch --index-filter \
'git rm --cached --ignore-unmatch path/to/mylarge_50mb_file' \
--tag-name-filter cat -- --all
Like the rebasing option described before, filter-branch is a history-rewriting operation. If you have already published the history, you'll have to force-push (--force) the new refs.
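To illustrate why the force push is needed, here is a minimal sketch that uses a throwaway local bare repository as the "remote". All paths and names (`remote.git`, `local`, `file.txt`) are made up for the demonstration, and the amended commit merely stands in for any history rewrite such as filter-branch.

```shell
set -eu
# Hypothetical sandbox: a bare repository plays the role of the published remote.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.email "demo@example.com"
git config user.name "Demo"
git remote add origin "$work/remote.git"

# Publish one commit.
echo one > file.txt
git add file.txt
git commit -q -m "Original commit"
branch=$(git symbolic-ref --short HEAD)
git push -q origin "$branch"

# Stand-in for a history rewrite: the amended commit gets a new ID.
git commit -q --amend -m "Rewritten commit"

# A plain push is rejected because the remote branch is not an ancestor anymore.
if git push -q origin "$branch" 2>/dev/null; then
  plain_push=accepted
else
  plain_push=rejected
fi

# Force-push all rewritten branches, then the tags.
git push -q --force origin --all
git push -q --force origin --tags

local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$work/remote.git" rev-parse "refs/heads/$branch")
echo "plain push: $plain_push"
echo "local:  $local_head"
echo "remote: $remote_head"
```

Anyone who cloned the old history still has the old commits locally, so a force push on its own does not remove the file from their copies.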
The filter-branch approach is considerably more powerful than the rebase approach, since it
- allows you to work on all branches/refs at once,
- rewrites any tags on the fly so that they point at the rewritten commits (--tag-name-filter cat keeps the tag names unchanged)
- operates cleanly even if there have been several merge commits since the addition of the file
- operates cleanly even if the file was (re)added/removed several times in the history of (a) branch(es)
- doesn't create new, unrelated commits, but rather copies them while modifying the trees associated with them. This means that stuff like signed commits, commit notes etc. are preserved
filter-branch keeps backups too, so the size of the repo won't decrease immediately unless you expire the reflogs and garbage collect:
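rm -Rf .git/refs/original             # careful: deletes the backup refs written by filter-branch
git reflog expire --expire=now --all  # expire reflog entries that still reference the old commits
git gc --aggressive --prune=now       # danger: permanently deletes the now-unreachable objects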
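The steps above can be tried end to end in a throwaway repository before touching a real one. The following is only a sketch under stated assumptions: the file names (`data.bin`, `README`) and the 1 MB size are invented for the demo, and `FILTER_BRANCH_SQUELCH_WARNING` is set just to silence the notice that newer Git versions print before running filter-branch.

```shell
set -eu
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# Commit 1: the accidental large file (random data, so it barely compresses).
head -c 1048576 /dev/urandom > data.bin
git add data.bin
git commit -q -m "Add data.bin (too large)"

# Commit 2: an unrelated file that must survive the rewrite.
echo "hello" > README
git add README
git commit -q -m "Add README"

size_before=$(du -sk .git | cut -f1)

# Rewrite all refs, removing data.bin from the index of every commit.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --index-filter \
  'git rm --cached --ignore-unmatch data.bin' \
  --tag-name-filter cat -- --all >/dev/null 2>&1

# Remove the backup refs, expire the reflogs, and prune unreachable objects.
rm -Rf .git/refs/original
git reflog expire --expire=now --all
git gc --quiet --aggressive --prune=now

size_after=$(du -sk .git | cut -f1)

# Count the commits, on any ref, that still touch data.bin.
remaining=$(git log --all --oneline -- data.bin | wc -l | tr -d ' ')
echo "commits touching data.bin: $remaining"
echo ".git size: ${size_before}K before, ${size_after}K after"
```

Comparing the two sizes is a quick way to confirm that the blob was actually pruned. If the size does not drop, something (a backup ref, a reflog entry, another branch or tag) is probably still referencing the old commits.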

- It's worth noting that this doesn't seem to work under Windows cmd.exe. It seems to work fine under Cygwin, though. – Fake Name Nov 16 '13 at 05:29
- I got the above git filter-branch command to work by using double quotes instead of single quotes (on Windows Server 2012 cmd.exe). – JCii Dec 19 '13 at 07:09
- What worked for me was this filter-branch command line: `git filter-branch --force --index-filter 'git rm --ignore-unmatch --cached PathTo/MyFile/ToRemove.dll' -- fbf28b005^..` Then `rm --recursive --force .git/refs/original` and `rm --recursive --force .git/logs`. Then I used `git prune --expire now` and `git gc --aggressive`. This worked better for me than the exact steps listed above. Thank you for including the link to the Pro Git book, as it was invaluable. – dacke.geo Nov 16 '15 at 16:53
- After the filter-branch command, the only way I could get the size of the .git folder down was to run the command found here: http://stackoverflow.com/questions/1904860/how-to-remove-unreferenced-blobs-from-my-git-repo `git -c gc.reflogExpire=0 -c gc.reflogExpireUnreachable=0 -c gc.rerereresolved=0 -c gc.rerereunresolved=0 -c gc.pruneExpire=now gc "$@"` – Steve Ardis Mar 14 '16 at 17:39
- For shrinking the repo, I used the commands listed in the git filter-branch documentation: https://git-scm.com/docs/git-filter-branch#_checklist_for_shrinking_a_repository – Ludovic Ronsin Sep 04 '17 at 13:20
- @TannerBabcock With that amount of detail, nobody is going to be able to help. The book is well known and has helped enough people facing the same problems, so I bet there's just an error with expectations. Why don't you just ask for help with your situation, instead of unproductively complaining about existing posts? We're all here to help. – sehe Dec 12 '18 at 19:00
- For me, this worked exactly as posted by Matthew. The only thing I had to change was the filename. – jollycat May 11 '20 at 10:22
- This worked for me, but I then got an "unrelated histories" error when trying to pull before I pushed. I solved this with `git pull --allow-unrelated-histories`. – user7722867 Mar 08 '21 at 10:27
- @user7722867 Be careful. That's not "solving" the fact that the histories are unrelated; it merely silences the warning. The thing about rebasing is that it rewrites your history. You will need to rebase all branches onto this rewrite so that you can branch and merge normally. Otherwise, "unrelated" branches are useful only if you intend to (a) keep independent branches that never interact or (b) keep unrelated branches for purposes such as cherry-picking individual commits. – sehe Mar 08 '21 at 14:22
- Well, I tried this on my repo, where I had hundreds of images, and somehow it actually doubled the size! I followed the exact steps of this answer; hopefully the references in the comments will still solve it for me. – Koen Jun 09 '21 at 16:43
- @KoenduBuf It probably means you have to garbage collect. Git will do this periodically, but you can do it yourself if you're SURE you don't need the now-rewritten commits in their old form. (It does make sense that a version control system tries not to lose your old data; see e.g. https://stackoverflow.com/questions/7654822/remove-refs-original-heads-master-from-git-repo-after-filter-branch-tree-filte) – sehe Jun 09 '21 at 16:59
- I did garbage collect, in about 200 ways, and it did not solve the problem for me. Here is (simplified) what happened: I basically had two giant files, but only ran the first command on one of them, since I forgot I had a second giant file. Then I did some garbage collection, which only increased the size of my repo for some reason (maybe it stored the other giant file multiple times, I don't know). All was solved when I removed the other giant file as well. – Koen Jun 09 '21 at 23:35
- Oops, why use the --all flag? I wanted to test on the test branch and didn't notice that. – Alexander Myasnikov Jun 02 '23 at 18:53
- @AlexanderMyasnikov Because files are usually removed for important reasons (for example, they're big or contain sensitive information). Unless you process all branches, the file will still be in the repository. Also, it's a good thing that you still have the backup after `filter-branch`. – sehe Jun 03 '23 at 01:30
You can use the git-extras tool. Its obliterate command completely removes a file from the repository, including past commits and tags.

- This had to rewrite the entire history, which was about 30000 commits, even though the files I wanted to remove were just 5 commits old. – thanos.a Sep 20 '22 at 12:33
- `git obliterate` does basically [the same](https://github.com/tj/git-extras/blob/6d73f74f83e734276b32e408f21214f489e812a3/bin/git-obliterate#L4) as the [accepted answer](https://stackoverflow.com/a/8741530/1319821). – Y. E. May 07 '23 at 12:42
I tried using the following answer on Windows: https://stackoverflow.com/a/8741530/8461756
Single quotes do not work on Windows; you need double quotes.
The following worked for me:
git filter-branch --force --index-filter "git rm --cached --ignore-unmatch PathRelativeRepositoryRoot/bigfile.csv" -- --all
After removing the big file, I was able to push my changes to GitHub master.

- Somehow `.\relative\path\to\file*` doesn't work for me. I need to use `*file*` instead. – Ooker Dec 16 '22 at 16:04
You will need to use git rebase in interactive mode. See the examples in "How can I remove a commit on GitHub?" and "how to remove old commits".
If your commit is at HEAD minus 10 commits:
$ git rebase -i HEAD~10
After editing your history, you need to push the "new" history. Prefix the branch name with + to force the push (see the refspec in the push options):
$ git push origin +master
If other people have already cloned your repository, you will need to inform them, because you have just changed the history.
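As a rough illustration of the rebase approach, the sketch below drops a single offending commit in a throwaway repository. To keep it scriptable, it assumes the todo list can be edited by `sed` through the `GIT_SEQUENCE_EDITOR` environment variable instead of by hand; the file names and commit messages are invented for the demo. It only works this cleanly because no later commit touches the dropped file.

```shell
set -eu
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo base > base.txt
git add base.txt
git commit -q -m "Base commit"

# The commit we want to get rid of.
head -c 1048576 /dev/urandom > big.bin
git add big.bin
git commit -q -m "Add big.bin by mistake"

# A later, unrelated commit that should be kept.
echo notes > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# The todo list for HEAD~2 starts with the oldest commit ("Add big.bin by mistake").
# Changing its "pick" to "drop" is what you would otherwise do by hand in the editor.
GIT_SEQUENCE_EDITOR="sed -i -e '1s/^pick/drop/'" git rebase -q -i HEAD~2

commit_count=$(git rev-list --count HEAD)
touching=$(git log --oneline -- big.bin | wc -l | tr -d ' ')
echo "commits on branch: $commit_count"
echo "commits touching big.bin: $touching"
```

This rewrites only the current branch. Any other branch, tag, or reflog entry that still references the dropped commit keeps the large object in the repository until it is rewritten or expired as well.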

- That does **not** remove the large file from history. Also, the canonical way to force push is `git push --force` or `git push -f` (which doesn't require people to know the branch push target). – sehe Jan 05 '12 at 11:13
- Based on the question, the new file is exactly the same as the old file, that is, the same path. This is why you cannot directly use `git rm` on the path. – Loïc d'Anterroches Jan 05 '12 at 12:22
- @sehe, if you do a rebase eliminating the commit with the huge file, it is gone for good. – vonbrand Feb 07 '13 at 01:33
- @vonbrand Only from the branch that you rebased. I'm not assuming the 'from' branch gets deleted. But yeah, if you delete a revision tree branch, that will help :_ – sehe Feb 07 '13 at 12:38
- @sehe, sure, you have to chase down all branches containing the offending commit. If it is before some bushiness in the repo, you'll have a lot of reorganizing to do. But rebase _is_ the tool for this. – vonbrand Feb 07 '13 at 13:02
- Mmm, I guess in the context of a single commit, using rebase is as good, and probably easier to explain. My answer is just more general. Imagine a large binary file that changed over many commits over the years. You wouldn't want to be manually rebasing your way out of there. @vonbrand Thanks for a fruitful discussion. – sehe Feb 07 '13 at 13:10
You can use a simple command to delete the file:
git rm -r -f app/unused.txt
git rm -r -f yourfilepath

- This will leave the file in the history. The question asks how to remove the file from the history as well, as if the file had never been added in the first place. – Simon Brown Jan 20 '23 at 14:15
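The point made in the comment above can be checked in a throwaway repository. This is a minimal sketch with invented names (`unused.txt`, the "secret" content): after `git rm` and a new commit, the file is gone from the working tree but is still retrievable from the earlier commit.

```shell
set -eu
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "secret" > unused.txt
git add unused.txt
git commit -q -m "Add unused.txt"
first_commit=$(git rev-parse HEAD)

# Delete the file the "simple" way and commit the deletion.
git rm -q unused.txt
git commit -q -m "Remove unused.txt"

# Both the commit that added the file and the one that removed it are still in the log.
history_count=$(git log --oneline -- unused.txt | wc -l | tr -d ' ')

# The old content is still stored in the repository.
old_content=$(git show "$first_commit:unused.txt")
echo "commits touching unused.txt: $history_count"
echo "content recovered from history: $old_content"
```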