
Suppose I have a Git repository with huge trees (~60 GiB) and some history, where old versions contain many deleted files.

I now want to prune old history, but without rebasing all the commits after the prune point, because that would take several hours for each commit.

  • Can I just delete the first commit object to remove, and hope for git gc to delete all (now unreferenced) older ones? Or will this cause panic because of missing objects?

  • Can I use git replace to replace the first commit I want to remove with a dummy commit and then call git gc?

  • Is there some other method to remove my old commits in-place?

cfstras
  • I think any kind of history rewriting will have to involve rewriting everything after the prune point anyway, unless each commit's SHA ID isn't actually based on the data of its parent. You might be able to do it more efficiently with `git filter-branch` or the BFG Repo Cleaner. –  Jul 14 '14 at 19:34
  • Each commit's SHA-1 ID *is* based on the data of its parent commits by design. You have to rebase: each commit from your prune point will need to have a completely new ID because it is based on different earlier commit IDs. – Matthew Strawbridge Jul 14 '14 at 20:37

1 Answer


without rebasing all the commits after the prune point, because that would take several hours for each commit.

After Git 2.18 (Q2 2018)

Since Git 2.18, the grafts file has been superseded by `git replace` (replace refs under `refs/replace/`): see "What are .git/info/grafts for?".

As noted in the comments, you would instead use `git replace --convert-graft-file`:

Creates graft commits for all entries in $GIT_DIR/info/grafts and deletes that file upon success.
The purpose is to help users with transitioning off of the now-deprecated graft file.

So, after `git rev-parse HEAD~100 > .git/info/grafts`, run `git replace --convert-graft-file`.
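A minimal sketch of the grafts-to-replace conversion, run in a throwaway repository so nothing real is touched. The five dummy commits, the `demo` directory, and the `HEAD~2` cut point are illustrative assumptions, not part of the original answer; on a real repository you would pick your own cut point, such as `HEAD~100`.

```shell
set -eu

# Throwaway repository with five dummy commits (illustrative only).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"
for i in 1 2 3 4 5; do
  echo "$i" > file.txt
  git add file.txt
  git commit -q -m "commit $i"
done

# A grafts line holding only a commit ID declares that commit parentless.
mkdir -p .git/info
git rev-parse HEAD~2 > .git/info/grafts

# Turn the graft into a replace ref; the grafts file is deleted on success.
git replace --convert-graft-file

# History as Git now presents it: only the cut point and its descendants.
count=$(git rev-list --count HEAD)
echo "visible commits: $count"
```

The replace ref only changes how history is presented locally; the older commit objects remain in the object store until the history is actually rewritten.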

Also, `git filter-branch` and BFG are obsolete with Git 2.22 or later.

Install `git filter-repo` and run `git filter-repo --force` to rewrite the history.


Before Git 2.18 (Q2 2018):

That is what graft points are for (better in this particular case than `git replace`, as I detail here).

A `.git/info/grafts` file containing a single line with only a commit ID declares that this commit has no parent.
To keep the last 100 commits, using git rev-parse:

 git rev-parse HEAD~100 > .git/info/grafts

Then:

 git filter-branch -- --all

Finally:

rm -Rf .git/refs/original

Then you can prune the rest:

git reflog expire --expire=now --all
git gc --prune=now
git gc --aggressive --prune=now
git repack -Ad      # kills in-pack garbage
git prune           # kills loose garbage
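The whole pre-2.18 sequence can be tried end to end in a scratch repository. This is a sketch under stated assumptions: the five dummy commits and the `HEAD~2` cut point are made up for illustration, the `rm -f .git/info/grafts` step is an addition of this sketch (the graft is no longer needed once the history is rewritten), and `FILTER_BRANCH_SQUELCH_WARNING=1` is only there to skip the warning pause that recent Git versions add to `git filter-branch`. Recent Git versions also print a deprecation hint for the grafts file, but should still honor it.

```shell
set -eu

# Scratch repository with five dummy commits (illustrative only).
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"
for i in 1 2 3 4 5; do
  echo "$i" > file.txt
  git add file.txt
  git commit -q -m "commit $i"
done

# Cut the history: the commit two steps back becomes the new root.
mkdir -p .git/info
git rev-parse HEAD~2 > .git/info/grafts

# Bake the graft into real commits (everything after the cut gets a new ID).
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- --all

# Drop the graft (assumption of this sketch) and filter-branch's backup refs.
rm -f .git/info/grafts
rm -Rf .git/refs/original

# Expire reflogs and prune the now-unreachable old objects.
git reflog expire --expire=now --all
git gc --quiet --prune=now

count=$(git rev-list --count HEAD)
echo "commits left: $count"
git count-objects -v
```

Only the commit objects are rewritten here, since no tree or blob changes; that is why this step is far cheaper than a rebase on a repository with huge trees.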
VonC
  • But grafts don't get pushed/pulled, right? It sounds great not to have to create an empty replacement commit. I'm guessing there is no way to make it so that replaces can be transferred using push/pull... – cfstras Nov 16 '17 at 11:45
  • @cfstras no, grafts are purely local. Hence the need for the `git filter-branch`, to integrate their modification into the repo history. Once that is done, you can push the changed history. (a `push --force`, since the history has been rewritten, so make sure you are the only one working on that repo, or make sure your colleagues are aware of that change) – VonC Nov 16 '17 at 11:49
  • Ah, that makes sense. I'm guessing `git filter-branch` in this case will only reprocess the commit objects (changing the parent SHA), and will thus be almost instantaneous to run? – cfstras Nov 17 '17 at 11:37
  • @cfstras yes, very quick (not "instantaneous" though, unless your history is quite small) – VonC Nov 17 '17 at 13:56
  • Wonderful! I'm not quite sure what my use-case for this question was when I asked it 3 years ago, but it's satisfying to see it answered :) – cfstras Nov 17 '17 at 14:06
  • "fatal: empty ident name (for <>) not allowed" ... :( – user1133275 Feb 15 '20 at 17:49
  • @user1133275 What version of Git are you using? – VonC Feb 15 '20 at 18:46
  • @VonC git version 2.17.1 (Ubuntu 18.04.4 LTS) this comment specifically; https://github.com/torvalds/linux/commit/af25e94d4dcfb9608846242fabdd4e6014e5c9f0 – user1133275 Feb 15 '20 at 22:58
  • git version 2.26 says that grafts is deprecated and will be removed. One is urged to turn grafts into a replace ref with `git replace --convert-graft-file`. – Kevin Buchs Jun 01 '22 at 20:31
  • @KevinBuchs Thank you for the feedback. It is 2.18 actually. I have edited the answer to update the commands. – VonC Jun 01 '22 at 20:54