Unfortunately, we accidentally checked in a large binary file some time ago, and nobody noticed until today. Now I want to drop that commit and keep the remaining history as it is. I know about the caveats of changing pushed history, but in this case I cannot avoid it.

I have been trying to achieve that for about an hour but cannot get it to work. The best command I found is

git rebase --interactive --preserve-merges $(EVIL_COMMIT)^

and then, in the editor, commenting out the first commit, which is the evil one.

Unfortunately, git rebase stops at merges and prompts for manual resolution of merge conflicts. The evil commit only adds some example files that our software is supposed to process for testing purposes. Thus there shouldn't be any conflicts when the example files are simply missing.

  1. I do not understand where the merge conflicts come from. Can somebody explain?
  2. How can I resolve that?

I've spent quite a lot of time searching Google and SO. Some threads cover a similar topic, but either the syntax used is no longer available in today's Git version or it didn't work for me (I only described one method above because it's the easiest approach).

Daniel Böhmer
  • possible duplicate of [How to remove/delete a large file from commit history in Git repository?](http://stackoverflow.com/questions/2100907/how-to-remove-delete-a-large-file-from-commit-history-in-git-repository) –  Apr 04 '14 at 00:51

2 Answers

I'd go with filter-branch:

git filter-branch --prune-empty --index-filter '
  git rm --cached --ignore-unmatch path/to/file
' -- --all
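
One way to check the result afterwards is to list the commits that still touch the path. This is only a minimal sketch: `commits_touching` is a hypothetical helper name, not a Git command. It deliberately looks at branches and tags only, because `--all` would also walk the `refs/original` backup refs that filter-branch leaves behind, and those still contain the old commits.

```shell
# Hypothetical helper (not part of Git): print every commit reachable from
# any branch or tag that still touches one of the given paths.
# refs/original/* is intentionally not included.
commits_touching() {
  git log --branches --tags --format='%h %s' -- "$@"
}

# Usage, run inside the repository after the rewrite:
#   commits_touching path/to/file
# No output means no branch or tag history references the path any more.
```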
knittl
  • This only removes the file `path/to/file` from all revisions, right? As far as I understand, it doesn't remove the commit, which will be left empty after deleting the binaries (there are actually 4 of them). – Daniel Böhmer Nov 01 '11 at 17:09
  • I think you mean `HEAD` rather than `$(EVIL_COMMIT)^` in this answer - otherwise it would only filter the commits up to the one before the one that introduced the large file. – Mark Longair Nov 01 '11 at 17:30
  • The option for `git rm` is `--ignore-unmatch` without *~ed*. – Daniel Böhmer Nov 01 '11 at 17:46
  • I successfully removed the commit introducing the binaries from my repo with `git filter-branch --index-filter ' git rm --cached --ignore-unmatch --prune-empty file{A,B,C,D} ' HEAD`. I did that for all branches including the ones on upstream. Unfortunately `git prune` does not remove the objects but I cannot find any references to the files. I tried http://stackoverflow.com/questions/460331/git-finding-a-filename-from-a-sha1/460417#460417 but it doesn't list any commits containing the binaries. :-/ – Daniel Böhmer Nov 01 '11 at 19:27
  • Thanks for all the comments! @halo: you'd have to expire the reflogs and remove the `refs/original` ref namespace (kept by git for safety reasons) – knittl Nov 01 '11 at 19:57
  • Finally got it by following http://progit.org/book/ch9-7.html#removing_objects The large blob was still referenced by commits in the logs... – Daniel Böhmer Nov 01 '11 at 20:29
  • Will this also make the clone smaller? – lesolorzanov Nov 04 '16 at 13:50
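
The cleanup mentioned in the comments above (removing the `refs/original` namespace and expiring the reflogs so the large blob becomes unreachable) could look roughly like the following. This is a sketch under assumptions, not the exact commands used in this thread: `purge_leftovers` is a hypothetical helper name, and it permanently discards the backup refs and all reflog entries, so it is best tried on a throwaway clone first.

```shell
# Hypothetical helper (not part of Git): drop what still references the old
# history after git filter-branch, then prune the unreachable objects.
purge_leftovers() {
  # 1. Delete the backup refs that filter-branch stores under refs/original/.
  git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
      git update-ref -d "$ref"
    done

  # 2. Expire every reflog entry so the old commits are no longer referenced.
  git reflog expire --expire=now --all

  # 3. Repack and remove objects that are now unreachable.
  git gc --quiet --prune=now
}

# Usage, run inside the rewritten repository:
#   purge_leftovers
```

Other clones and the upstream repository keep their own copies of the old objects until they are rewritten and cleaned up as well.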

My friend started a convenient script for purging files from Git history; see https://github.com/donquixote/gitpurge

the