
A user added and then removed a large directory of files in her workspace, in two distinct commits. She then pushed her changes to the central repo, and to the end user it looks like a no-op. The problem is that our central repo has grown by 50x in size because of this. I have tried several things with filter-branch and it's just not working.

The folder was added at the root level of the repository and is named .core

I have tried filter-branch commands based on the following links:

http://www.somethingorothersoft.com/2009/09/08/the-definitive-step-by-step-guide-on-how-to-delete-a-directory-permanently-from-git-on-widnows-for-dumbasses-like-myself/

http://dound.com/2009/04/git-forever-remove-files-or-folders-from-history/

Remove a directory permanently from git

The final commands I tried look like this:

git filter-branch -f --index-filter "git rm -rf --cached --ignore-unmatch .core" --prune-empty --tag-name-filter cat -- --all
rm -Rf .git/refs/original
rm -Rf .git/refs/logs
git reflog expire --expire=now --all && git gc --prune=now --aggressive

The resulting output says the ref is unchanged. I have tried replacing .core with variations such as ./.core, *.core/* and so on, but nothing gets removed.

Thanks.

Peter

1 Answer


Use the BFG, a simpler, faster alternative to git-filter-branch, specifically designed for removing unwanted files from Git history.

Carefully follow the BFG's usage instructions - the core part is just this:

$ java -jar bfg.jar --delete-folders .core my-repo.git

Any folder named .core (that isn't in your latest commit) will be removed from your Git repository's history. You can then use git gc to clean away the dead data:

$ git reflog expire --expire=now --all && git gc --prune=now --aggressive
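Put together, the steps above could be sketched as one small shell function. This is only an illustration under assumptions: the function name clean_core_history, its two arguments (the repository URL and the path to the BFG jar) and the temporary working directory are placeholders that do not come from the question. It works on a fresh mirror clone and leaves the final push to you, so the central repo is untouched until you have inspected the result.

```shell
#!/bin/sh
# Sketch only: clean_core_history and its arguments are hypothetical names.
# $1 = URL or path of the central repo, $2 = path to the BFG jar.
clean_core_history() {
    repo_url=${1:-}
    bfg_jar=${2:-}
    if [ -z "$repo_url" ] || [ -z "$bfg_jar" ]; then
        echo "usage: clean_core_history <repo-url> <path-to-bfg.jar>" >&2
        return 2
    fi

    # Work on a bare mirror clone so the rewrite covers every ref.
    work=$(mktemp -d) || return 1
    git clone --mirror "$repo_url" "$work/repo.git" || return 1

    # Strip every folder named .core from history (HEAD is left protected).
    java -jar "$bfg_jar" --delete-folders .core "$work/repo.git" || return 1

    # Expire reflogs and repack so the dead objects are actually dropped.
    (
        cd "$work/repo.git" || exit 1
        git reflog expire --expire=now --all &&
            git gc --prune=now --aggressive
    ) || return 1

    echo "Cleaned mirror: $work/repo.git (inspect it, then run 'git push' from there)"
}
```

Because history is rewritten, existing clones would still carry the old commits; after the push it is probably safest for everyone to re-clone rather than pull.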

The BFG is typically 10-720x faster than running git-filter-branch, and generally easier to use.
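Before and after any rewrite, it may help to check whether history still references the folder at all. The following is a self-contained demo, not part of the original workflow: it builds a throwaway repo in a temporary directory (the file names and commit messages are made up), adds and then removes a .core folder in two commits as in the question, and counts the commits that touch that path.

```shell
#!/bin/sh
# Demo in a throwaway repo; nothing here touches a real repository.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Recreate the situation from the question: add .core, then remove it.
echo hello > README
git add README
git commit -q -m "initial"
mkdir .core
echo big > .core/blob.bin
git add .core
git commit -q -m "add .core"
git rm -r -q .core
git commit -q -m "remove .core"

# Commits on any ref that touch the .core path.
touching=$(git log --all --oneline -- .core | wc -l | tr -d ' ')
# Files under .core in the latest commit (grep -c prints 0 when none match).
in_head=$(git ls-tree -r --name-only HEAD | grep -c '^\.core/' || true)

echo "commits touching .core: $touching"
echo "files under .core in HEAD: $in_head"
```

On this demo history the folder is gone from HEAD but two commits still reference it, which is why the repo stays large. After a successful cleanup, the same git log command run in the rewritten repo should print nothing.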

Full disclosure: I'm the author of the BFG Repo-Cleaner.

Roberto Tyley
  • Out of curiosity: Would this tool be rewriting history? If I were to checkout the commit in question, how would Git deliver the large files to me? – Tim Biegeleisen Apr 22 '15 at 14:59
  • OK, we use BFG to clean up big blobs but had never tried it on a folder. I can confirm this works extremely well. Just as a note, there was an error in the output using bfg-1.12.3: /gitstore/repositories/test/root.git.bfg-report/2015-04-22/11-02-43 Exception in thread "main" java.io.IOException: No such file or directory at java.io.UnixFileSystem.createFileExclusively(Native Method) at java.io.File.createNewFile(File.java:900) at scalax.io.support.FileUtils$$anonfun$scalax$io$support$FileUtils$$preOpen$1.apply(FileUtils.scala:66) – Peter Apr 22 '15 at 15:07
  • Peter, I get the same exception, did you find a solution? – centic Jan 22 '21 at 16:47