I accidentally added and committed some very large (100MB+) PSD files in a git directory. I made a bunch of edits to those files while they were in the directory, but then realized they shouldn't be there and removed them from the directory.

I then ran:

git add --all && git commit -m "Removed large psds"

The files in my directory now add up to less than a dozen MB, except for the .git directory itself, which is 700MB+.

What is going on here? Is it retaining old versions of the removed .PSD files? Does that mean git doesn't ever clear out the space gained from deleting files? How do I have it forget about those files completely so that I can bring the .git file size back down?

Roberto Tyley
Yarin

2 Answers

Your .git folder is so big because the PSD files are still present in the repository's history. To remove them, you need to rewrite the history using git filter-branch. The git filter-branch documentation explains how to use this command. Afterwards, you will need to clean up the repository.
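As a rough illustration of the two steps above (rewrite, then clean), here is a minimal sketch that runs entirely in a throwaway repository. The file names `big.psd` and `readme.txt`, the demo identity, and the 1 MB of random data standing in for a PSD are all assumptions made for the example; adapt the `*.psd` pattern to your own files and try it on a copy of your repository first.

```shell
#!/bin/sh
# Throwaway demo: everything happens in a fresh temporary directory.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"   # placeholder identity (assumption)
git config user.name "Demo"

# Commit a "large" file (1 MB of random data stands in for a PSD), then delete it.
dd if=/dev/urandom of=big.psd bs=1024 count=1024 2>/dev/null
echo "hello" > readme.txt
git add --all && git commit -q -m "Add files"
blob=$(git rev-parse HEAD:big.psd)          # object ID of the PSD's contents
rm big.psd
git add --all && git commit -q -m "Removed large psds"

# The blob is still reachable from the first commit, so git keeps it.
git rev-list --objects --all | grep -q "$blob" && echo "before: blob still in history"

# Rewrite every commit, dropping *.psd from the index of each one.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
    --index-filter 'git rm -q --cached --ignore-unmatch "*.psd"' -- --all >/dev/null

# Drop filter-branch's backup refs and the reflog, then prune unreachable objects.
git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc -q --prune=now --aggressive

git cat-file -e "$blob" 2>/dev/null || echo "after: blob is gone"
```

Because every rewritten commit gets a new ID, a shared remote will reject a normal push afterwards; publishing the result generally needs a force push (for example `git push --force`), and pulling first would merge the old history, including the large files, back in.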

I created a script to help with this job. If you want to use it, you can download it from GitHub. Any comments are welcome.

Qix - MONICA WAS MISTREATED
William Seiti Mizuta
  • If it is just a single, recent commit, `git rebase` might be simpler to use. – vonbrand Mar 30 '13 at 03:07
  • This might be a stupid question, but after I run `git-delete`, how do I push the changes to `.git`? Tried add/commit/push and it said I needed to git pull, which just re-adds the files. – Austin Apr 07 '19 at 20:34

You want to use the BFG Repo-Cleaner, a faster, simpler alternative to git-filter-branch designed for removing large files from Git repos.

Download the BFG jar (requires Java 6 or above) and run this command:

$ java -jar bfg.jar --strip-blobs-bigger-than 10M my-repo.git

Any files over 10MB in size (that aren't in your latest commit) will be removed from your Git repository's history. You can then use git gc to clean away the dead data:

$ git gc --prune=now --aggressive
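The two commands above can be combined into one small helper. This is only a sketch: the function name `bfg_strip_large_blobs` and its argument order are hypothetical, and the `git reflog expire` step is an addition that is commonly run before `git gc` so that old reflog entries do not keep the stripped blobs alive. It assumes `java` and `git` are on your PATH and that you have already downloaded the BFG jar.

```shell
#!/bin/sh
# Hypothetical wrapper: the function name and argument order are assumptions.
# Usage: bfg_strip_large_blobs /path/to/bfg.jar /path/to/my-repo.git 10M
bfg_strip_large_blobs() {
    jar=$1
    repo=$2
    limit=$3
    # Rewrite history, removing blobs above the limit that are not in the latest commit.
    java -jar "$jar" --strip-blobs-bigger-than "$limit" "$repo" || return 1
    # Expire the reflog so the stripped blobs become unreachable, then prune them.
    git -C "$repo" reflog expire --expire=now --all &&
        git -C "$repo" gc --prune=now --aggressive
}
```

Running it against a fresh clone rather than your only copy keeps the original available if the result is not what you expected.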

The BFG is typically 10-50x faster than running git-filter-branch and the options are tailored around these two common use-cases:

  • Removing Crazy Big Files
  • Removing Passwords, Credentials & other Private data

Full disclosure: I'm the author of the BFG Repo-Cleaner.

Roberto Tyley