3

In one of my projects (checked into a git repository) I added a huge directory (15000 files, 3GB). When I realized this was wrong, I deleted it, but it seems to still be in the history.

Because it is still there, cloning the project takes a very long time. Once the project is cloned, the .git directory is about 4GB but the real project size is just 15MB.

My question is: how can I rewrite the history so that the 3GB directory is really gone? Or is there another way to decrease the project's download size and speed up the clone process?

Dan D.
  • Just to clarify: The 3GB has been in the repository for a while, correct? The handling is different if it is not a very recent commit or if other people have already pulled commits made after the large one. – Kevin Reid Mar 24 '12 at 19:50
  • The dir was there for about a month. I deleted it another month ago. There may be a couple of files that were changed. There are only 2 developers on the project. – Dan D. Mar 24 '12 at 20:04
  • possible duplicate of [Remove files from git repo completely](http://stackoverflow.com/questions/5563564/remove-files-from-git-repo-completely) – rtn Mar 24 '12 at 20:25

2 Answers

4

So you know which commit introduced the huge directory. Say this was done in revision AAAAAAA.

To get rid of the commit, it is not sufficient to delete the directory (with commit BBBBBBB) and check in again: the commit AAAAAAA is still there, blowing up your repo size.
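If the two hashes are not at hand, `git log` with `--diff-filter` can usually find them. The following is only a minimal sketch: the directory name `huge/`, the file names and the throwaway repository are all hypothetical and exist just to make the example runnable. In a real project only the two `git log` lines matter.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "code" > app.txt
git add app.txt
git commit -q -m "initial commit"

mkdir huge
echo "big payload" > huge/blob.bin
git add huge
git commit -q -m "add huge directory"

git rm -q -r huge
git commit -q -m "remove huge directory"

# Oldest commit that added a file under huge/ (the AAAAAAA of the answer).
added=$(git log --diff-filter=A --format=%H -- huge | tail -n 1)
# Newest commit that deleted a file under huge/ (the BBBBBBB of the answer).
deleted=$(git log --diff-filter=D --format=%H -- huge | head -n 1)

git log -1 --format="added in:   %h %s" "$added"
git log -1 --format="deleted in: %h %s" "$deleted"
```

The `--` before the path is needed because the directory no longer exists in the working tree.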

To get rid of the commit, we need git rebase. Open your git console and type

git rebase -i AAAAAAA~1

This will bring up an editor with the commit AAAAAAA on the first line. Delete this line (in Vim, hit dd), and also delete the line for BBBBBBB, the commit in which you removed the directory again. Then save the file and quit (:wqa).

After this, the rebase runs, and when it has finished, AAAAAAA and BBBBBBB are no longer part of the branch's history. Really.
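To try the procedure safely before touching the real project, the rebase can be rehearsed in a throwaway repository. This is a sketch under assumptions: the repository, file names and commit messages are made up, and `GIT_SEQUENCE_EDITOR` with `sed` is used only as a scripted stand-in for deleting the two lines by hand in the editor.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "code" > app.txt
git add app.txt
git commit -q -m "initial commit"

# Plays the role of AAAAAAA: adds the unwanted directory.
mkdir huge
echo "big payload" > huge/blob.bin
git add huge
git commit -q -m "add huge directory"
bad=$(git rev-parse HEAD)

echo "more code" >> app.txt
git commit -q -am "unrelated work"

# Plays the role of BBBBBBB: deletes the directory again.
git rm -q -r huge
git commit -q -m "remove huge directory"

echo "even more code" >> app.txt
git commit -q -am "later work"

# Scripted equivalent of removing both lines from the todo list.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '/add huge directory/d' -e '/remove huge directory/d'" \
  git rebase -q -i "${bad}~1"

# Only the commits that never touched huge/ should remain.
git log --format="%h %s"
```

The rebase goes through without conflicts here because the remaining commits never touch the directory. If a later commit had modified files inside it, the rebase would stop and ask for that conflict to be resolved.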

You could now also trigger some housekeeping with git gc and fetch a cup of coffee while it's running.
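One caveat: a plain `git gc` keeps objects that are still referenced from the reflog, so the space may not be freed right away. The sketch below illustrates this in a throwaway repository. The file names are hypothetical, and `git commit --amend` is used as a small stand-in for any history rewrite. For a real repository, only the `git reflog expire` and `git gc --prune=now` lines are relevant.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "code" > app.txt
git add app.txt
git commit -q -m "initial commit"

echo "big payload" > blob.bin
echo "notes" > notes.txt
git add blob.bin notes.txt
git commit -q -m "add notes and a big file"
blob=$(git rev-parse HEAD:blob.bin)

# Rewrite the last commit so that it no longer contains blob.bin.
git rm -q blob.bin
git commit -q --amend -m "add notes"

# A plain gc keeps the blob: the reflog still points at the old commit.
git gc --quiet
if git cat-file -e "$blob" 2>/dev/null; then before=present; else before=gone; fi

# Drop all reflog entries, then prune unreachable objects immediately.
git reflog expire --expire=now --all
git gc --quiet --prune=now
if git cat-file -e "$blob" 2>/dev/null; then after=present; else after=gone; fi

echo "after plain gc: $before"
echo "after expiring the reflog and pruning: $after"
```

Expiring the reflog also throws away the safety net for undoing the rewrite, so it is worth checking the rewritten history first.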


See also this answer: git push heroku - stop heroku pushing/uploading massive file

eckes
  • Thanks for your answer. It seems like a good plan to me, but I have actually preferred to create a new repository, copy the code there and start from scratch. It was easier for me and at least I understood the process entirely :) – Dan D. Mar 27 '12 at 10:07
  • What do `-i` and `~1` mean? – qed Jan 18 '15 at 15:59
  • @qed: -i means interactive (see the docs for rebase) and ~1 means one revision back (https://www.kernel.org/pub/software/scm/git/docs/gitrevisions.html) – eckes Jan 18 '15 at 16:23
1

Try to follow this guide: http://book.git-scm.com/4_undoing_in_git_-_reset,_checkout_and_revert.html

Basically, you can do

git reset --hard HEAD~1

or

git reset --hard [prev_commit]
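Note that `git reset --hard` moves the branch back to the given commit and discards everything after it, including uncommitted changes, so it only fits when the unwanted commit is the most recent one. A minimal sketch in a throwaway repository (all file names and messages are hypothetical):

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "code" > app.txt
git add app.txt
git commit -q -m "initial commit"

echo "oops" > mistake.txt
git add mistake.txt
git commit -q -m "unwanted commit"

# HEAD~1 names the parent of the current commit.
parent=$(git rev-parse HEAD~1)

# Move the branch back one commit and reset the working tree to match.
git reset -q --hard HEAD~1

git log --format="%h %s"
```

In the situation from the question, the directory was added about two months before, so resetting to the commit before it would also throw away all the work committed since then.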

maxk