50

In my personal git repo, I have a directory that contains thousands of small images that are no longer needed. Is there a way to delete them from the entire git history? I have tried

git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch imgs" HEAD

and

git filter-branch --tree-filter 'rm -fr imgs' HEAD

but the size of the git repo remains unchanged. Any ideas?

Thanks

R. Martinho Fernandes
adk

7 Answers

33

The ProGit book has an interesting section on Removing Objects.

It does end with this:

Your history no longer contains a reference to that file.
However, your reflog and a new set of refs that Git added when you did the filter-branch under .git/refs/original still do, so you have to remove them and then repack the database. You need to get rid of anything that has a pointer to those old commits before you repack:

$ rm -Rf .git/refs/original
$ rm -Rf .git/logs/
$ git gc
$ git prune --expire now

(git prune --expire now is not mandatory, but it can remove the directory content from the loose objects.)
Back up everything before running those commands, just in case ;)

VonC
15

Actually, none of these techniques worked for me. I found the most reliable way was to simply pull locally into another repo:

git pull file://$(pwd)/myGitRepo

It also saves you the hassle of deleting old tags.

See the story on my blog: http://stubbisms.wordpress.com/2009/07/10/git-script-to-show-largest-pack-objects-and-trim-your-waist-line/
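A throwaway sketch of why this tends to work, assuming a POSIX shell and a scratch directory (the repo contents and the names fresh and leftover are made up for illustration): a pull over file:// goes through Git's normal object transfer, which should copy only objects reachable from the fetched history, so unreferenced leftovers stay behind in the old repo.

```shell
#!/bin/sh
# Illustrative only: builds two scratch repos under a temporary directory.
set -e
work=$(mktemp -d)
cd "$work"

# Stand-in for the original repo: one commit plus one unreferenced blob.
git init -q myGitRepo
cd myGitRepo
git config user.email demo@example.com
git config user.name demo
echo hello > README
git add README
git commit -q -m "initial commit"
leftover=$(echo "stale image data" | git hash-object -w --stdin)
cd ..

# Fresh repo that pulls the history over file://
git init -q fresh
cd fresh
git pull -q "file://$work/myGitRepo"

# Check whether the unreferenced blob made it into the fresh repo.
if git cat-file -e "$leftover" 2>/dev/null; then
    echo "leftover blob was copied"
else
    echo "leftover blob was left behind"
fi
```

Note that a plain git pull like this only brings over the branch that HEAD points to, so other branches and tags would need to be fetched separately if you want them.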

Antony Stubbs
  • This seems to be the deal closer for me. I have documented the Windows-specific steps here: http://www.somethingorothersoft.com/?p=80 – Igor Zevaka Sep 08 '09 at 01:45
  • 1
    The question was "Is there a way to delete them (files in directory) from the entire git history?" How does this answer help do that? – hitautodestruct Dec 09 '19 at 12:30
  • @hitautodestruct because the question correctly shows how to remove the object from the active tree, but is missing how to remove the _dangling_ references to those files. Thanks for the blast from the past - zombie answer from ten years ago =D – Antony Stubbs Dec 18 '19 at 13:58
  • I know it's an oldy, but I have no idea how this answers the question. The question asks how to modify an existing git repo to remove a folder, and this just shows how to pull into a different repo - this clearly does not directly answer the question – PandaWood Jun 05 '21 at 05:48
13

git-filter-branch by default saves old refs in refs/original/* namespace.

You need to delete them, and then do git gc --prune=now

iwasrobbed
Jakub Narębski
10

Brandon Thomson asked in a comment to Rainer Blome's solution if this just fixed the gitk view or if the refs will be really gone. A good way to check this is to remember one of the sha1 hashes (or a unique prefix of it) of the old commits and try

$ git ls-tree hash-value

This should show you the content of the repo's main folder as it was in this commit. After

$ rm -Rf .git/refs/original
$ rm -Rf .git/logs/

as shown by VonC and removing the refs/original/… lines from .git/info/refs and .git/packed-refs as shown by Rainer Blome, a final

$ git gc --prune=now

made not only the refs, but also the old objects (commits, trees, and blobs) go away. The git ls-tree hash-value check shown above proves this. Another nice command to check this is git count-objects -v (run it before the filter-branch and after the pruning and compare the size).
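The before/after comparison with git count-objects -v can be sketched on a throwaway repo. This is only an illustration under assumptions: the scratch repo, the random 200000-byte file standing in for the images, and the size_kib helper are all made up here, and the cleanup uses git update-ref and git reflog expire rather than deleting files under .git by hand.

```shell
#!/bin/sh
# Illustrative only: everything happens inside a temporary directory.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q repo
cd repo
git config user.email demo@example.com
git config user.name demo

# One commit that adds an incompressible file under imgs/, one that drops it.
mkdir imgs
head -c 200000 /dev/urandom > imgs/a.bin
echo hello > README
git add .
git commit -q -m "add imgs"
git rm -q -r imgs
git commit -q -m "drop imgs from tip"

size_kib() {
    # Sum of loose ("size") and packed ("size-pack") object storage, in KiB.
    git count-objects -v | awk '/^size(-pack)?:/ { s += $2 } END { print s }'
}

before=$(size_kib)

# Rewrite history without imgs/, then drop everything that still points
# at the old commits before collecting garbage.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
    --index-filter 'git rm -rf --cached --ignore-unmatch imgs' HEAD >/dev/null 2>&1
git for-each-ref --format='%(refname)' refs/original |
    while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc -q --prune=now

after=$(size_kib)
echo "before=${before} KiB after=${after} KiB"
```

If the second number is not clearly smaller, something is probably still referencing the old commits (a tag, another branch, a stash, or a remote-tracking ref).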

Note: As I'm not allowed to comment on the other answers yet, I had to write a new one, although it mainly combines previously given answers.

Michael
  • This answer *seems* like the correct solution to me. However, I don't understand why the total size of my repository is unchanged. – dbn Feb 15 '13 at 23:54
3

If you want to go the manual cleanup route, there are some more files that may also contain a ref to the position of your original branch before the git-filter-branch. For example, I filtered my "home" branch:

.git/info/refs:

179ad3e725816234a7182476825862e28752746d refs/original/refs/heads/home

.git/packed-refs:

179ad3e725816234a7182476825862e28752746d refs/original/refs/heads/home

After I removed those lines, gitk did not show the old commits any more.
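To see which of these backup refs exist without opening the files by hand, something along these lines should work. It is a minimal sketch: the function name backup_refs is made up, and it relies on git for-each-ref reporting refs regardless of whether they are stored as loose files or in .git/packed-refs.

```shell
# List every ref saved under refs/original, as "<sha1> <refname>" lines.
# Covers both loose ref files and entries in .git/packed-refs.
backup_refs() {
    git for-each-ref --format='%(objectname) %(refname)' refs/original
}
```

Run backup_refs inside the repository; empty output means no refs/original entries are left.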

  • 1
    worked for me, although I kind of wonder if this just fixed the gitk view or if the refs will actually be gc'd now – gravitation Nov 20 '09 at 14:09
2

As this is an old question, perhaps some of this wasn't possible back then. This also assumes you're using bash or cygwin.

Warning: The second and third lines will permanently delete all commits unreachable from your branches/tags.

After running filter-branch, do

for ref in $(git for-each-ref --format='%(refname)' refs/original); do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc --prune=now

git for-each-ref --format='%(refname)' gets the reference names, and git update-ref -d deletes the reference. It is generally better not to modify the .git folder directly, and in particular this command handles the case when the refs are in packed-refs.

The second and third lines are taken directly from How to clean up unused side-branches in your commit trees?.

Community
Zantier
1

Answer for the year 2021

This surprisingly turns out to be a hard task. Google turns up pages dating back to 2009 and StackOverflow discussions almost a decade old. A lot of those things don't work any more!

Here's what works (it is also the recommended way according to the git docs):

First install git-filter-repo:

pip install git-filter-repo

Next, delete the folders from the Git history. This rewrites the entire history, keeping everything except the listed folders!

git filter-repo --force --invert-paths --path to/folder1 --path to/folder

Next, add back the remote (git filter-repo removes origin when it rewrites history):

git remote add origin https://...

Next, force push upstream:

git push --force --set-upstream origin master

That's a handful of commands, but I haven't found a shorter or better way.
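Before force pushing, it may be worth checking that the folders are really gone from every ref. A minimal sketch using plain git (the function name commits_touching is made up for illustration):

```shell
# Print one line per commit, reachable from any ref, that still touches
# the given path. No output means the path no longer appears in history.
commits_touching() {
    git log --all --oneline -- "$1"
}

# Example, run inside the rewritten repository:
#   commits_touching to/folder1
```

If this still prints commits after the rewrite, some ref (an old tag or a backup branch, for example) is probably still pointing at the original history.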

Shital Shah