41

I have a Git repository with a large number of commits (2000+), for example:

                 l-- m -- n   
                /
a -- b -- c -- d -- e -- f -- g -- h -- i -- j -- k
                     \
                      x -- y -- z

I want to truncate the old log history: delete every commit before a given commit (for example, commit "f"), so that this commit becomes the beginning of the repository.

How can I do this?

Chris Maes
Nips

3 Answers

61

To avoid losing history by accident, make a copy of your repository first :). Then run the following, where <f> is the SHA of commit f, the commit whose state you want as the new root:

git checkout --orphan temp <f>      # create a parentless branch named "temp" with the files as they are at commit f
git commit -m "new root commit"     # commit that state as the new root commit
git rebase --onto temp <f> master   # replay the commits after <f>, up to master, onto the temp branch
git branch -D temp                  # the temp branch is no longer needed
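Before running these commands on a real repository, it may help to try them on a throwaway one. The sketch below is only an illustration under stated assumptions: it builds a hypothetical linear history a..k in a temporary directory (the `log.txt` file and the demo identity are made up), applies the four commands above with f as the new root, and checks that the final tree is unchanged. It does not cover histories with merge commits, where conflicts may appear.

```shell
#!/bin/sh
set -eu

# Throwaway repository with a linear history a..k (hypothetical content).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Demo"
git config user.email "demo@example.com"
for name in a b c d e f g h i j k; do
  echo "$name" >> log.txt
  git add log.txt
  git commit -q -m "$name"
done

f=$(git rev-parse master~5)               # commit "f" in this demo history
tree_before=$(git rev-parse 'master^{tree}')

# The four steps from the answer.
git checkout -q --orphan temp "$f"
git commit -q -m "new root commit"
git rebase -q --onto temp "$f" master
git branch -q -D temp

count=$(git rev-list --count master)      # new root + g, h, i, j, k
tree_after=$(git rev-parse 'master^{tree}')
echo "commits on master: $count"
```

In this linear case the file contents at the tip should be identical before and after, which is what comparing `tree_before` and `tree_after` checks.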

If you have a remote that should get the same truncated history, you can use git push -f. Warning: this is a dangerous command because it overwrites the branch on the remote; don't use it lightly! To make sure the latest version of your code is still the same, run git diff origin/master first. It should show no changes, since only the history changed, not the content of your files.

git push -f  
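The check and the push can be combined so that the push only happens when the content is unchanged. This is a sketch, not part of the original answer: it assumes a remote named `origin` and a branch named `master`, it swaps in `--force-with-lease` (which refuses to overwrite the remote branch if it has moved since your last fetch) for plain `-f`, and it uses a local bare repository plus a `git commit-tree` squash purely as a stand-in for a real remote and a real history rewrite.

```shell
#!/bin/sh
set -eu

# Stand-in "remote": a bare repository in a temporary directory.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
for n in 1 2 3; do
  echo "$n" >> file.txt
  git add file.txt
  git commit -q -m "commit $n"
done
git remote add origin "$work/remote.git"
git push -q origin master

# Stand-in rewrite: one parentless commit with the same tree as master.
new_root=$(git commit-tree 'master^{tree}' -m "new root commit")
git reset -q --hard "$new_root"

# Push only if the rewritten branch has the same content as the remote branch.
if git diff --quiet origin/master master; then
  git push -q --force-with-lease origin master
  pushed=yes
else
  echo "content differs from origin/master; not pushing" >&2
  pushed=no
fi

remote_count=$(git --git-dir="$work/remote.git" rev-list --count master)
echo "pushed: $pushed, commits on remote master: $remote_count"
```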

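The following two commands are optional; they keep your Git repo in good shape. Note that old commits still referenced by the reflog are not removed until those reflog entries expire.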

git prune --progress                 # delete all objects that are no longer reachable from any reference
git gc --aggressive                  # aggressively collect garbage; may take a long time on large repos
Chris Maes
  • 7
    Sounds like what I need, but each time I run the third step (git rebase...) I get conflicts. Is that normal? – warpdesign Jul 03 '17 at 09:55
  • No, that doesn't seem normal. Did you include the `<f>` part (that is, the same commit SHA from which you created the temp branch) in that third step? – Chris Maes Jul 03 '17 at 13:43
  • 4
    @ChrisMaes, I get conflicts too. I see from the commit messages that the third command tries to apply old commits, from before `<f>`. – wl2776 Aug 09 '18 at 13:13
  • 2
    I'm getting conflicts as well. This might not work with big repos. – Julius Žaromskis Oct 11 '18 at 14:00
  • 2
    I am getting conflicts as well. Is there any way to force the rebase? – BharathKumarRaju Dasararaju Nov 14 '19 at 06:57
  • This answer is copied and pasted all over StackOverflow, despite being clearly incorrect (because it explicitly generates rebase conflicts by applying old commits on top of `<f>`). – gented Oct 19 '22 at 22:53
  • @gented. Rebasing commits doesn't automatically create merge conflicts. If you have a simple linear history, no conflict should arise (I have actually run this code). When you have merge commits in your history, then things can get complicated and merge conflicts can arise. – Chris Maes Oct 21 '22 at 12:01
  • @ChrisMaes Of course conflicts don't happen if there are no conflicts, and they happen if there are, that's a tautology :p. Your comment above _"no, that doesn't seem normal"_ is incorrect, because conflicts are exactly what to expect with this method (having files being modified before and after a certain `<f>`) in almost the totality of practical projects. – gented Oct 21 '22 at 16:24
  • @gented. Please read my last comment again, I did not use a tautology. The words "commit" and "conflict" do not mean the same. My other comment could have been better, I agree, and it would sound the same as my last comment: for simple, linear history after commit `<f>`, no conflicts should arise. When there are merge commits after `<f>` (especially merge commits with code coming from before `<f>`), then conflicts will probably arise. – Chris Maes Oct 25 '22 at 06:50
  • I am getting conflicts as well. My repo does have merge commits after `<f>`, which might be the cause. Is there a solution for such a case? – Junye Huang Feb 25 '23 at 18:47
  • I get conflicts too. This clearly isn't the right way to do it. (I don't know what is though..) – intrepidis Aug 22 '23 at 10:37
20

One possible solution is git clone with the --shallow-since option. If there are only a few commits since f and counting them is no trouble, you can use the --depth option instead.

The second option (--depth) clones only the specified branch. If you need additional branches, you can then add the original repo as a remote and use git fetch to retrieve them.

When you are pleased with the result, remove the old repository and rename the new one to replace it. If the old repository is remote then re-create it after removal and push from the new repo into it.

This approach has the advantage of size and speed. The new repo contains only the commits you want and there is no need to run git prune or git gc to remove the old objects (because they are not there).
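As a rough illustration of the `--depth` variant, the sketch below builds a hypothetical source repository with eight commits in a temporary directory and clones only the last three. The directory names and commit contents are made up for the demo. Note the `file://` URL: when cloning from a plain local path, Git ignores `--depth`. With a real repository you would pass its URL instead, or use `--shallow-since=<date>` with a date just before commit f.

```shell
#!/bin/sh
set -eu

# Hypothetical full repository with 8 commits.
work=$(mktemp -d)
git init -q "$work/full"
cd "$work/full"
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
for n in 1 2 3 4 5 6 7 8; do
  echo "$n" >> file.txt
  git add file.txt
  git commit -q -m "commit $n"
done

# Shallow clone that keeps only the 3 most recent commits of master.
cd "$work"
git clone -q --depth 3 --branch master "file://$work/full" shallow

full_count=$(git -C full rev-list --count master)
shallow_count=$(git -C shallow rev-list --count HEAD)
echo "full: $full_count commits, shallow clone: $shallow_count commits"
```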

axiac
  • 3
    a nice alternative. +1 – Chris Maes Jan 31 '17 at 09:48
  • If you want to keep the history but on the remote only, don't do the last step. For my application, this is the best configuration: I have the bloated history on the remote in the unlikely event I need it, but locally clones and updates are quick and don't take up much disk space. – Liam Jan 23 '18 at 15:03
  • 4
    Advice with re-creation of remote did not work for me: `[remote rejected] develop -> develop (shallow update not allowed)`. – wl2776 Aug 09 '18 at 15:34
  • I tried to be clever and push the shallow clone into new branch (instead of new origin). But GitHub still remembered the "deleted" history. In other words, I recreated a branch on origin, not a whole origin, and history didn't budge. Why is that? Why do I have to recreate the origin? – Maxim Kamalov Jul 02 '21 at 01:30
  • @MaximKamalov It depends where your new branch starts from. If it starts from the current `master` then it inherits the entire history of `master`. Use a GUI Git client to see the history and the relationship between commits. – axiac Jul 02 '21 at 08:09
  • Btw, in the end I used this method: https://stackoverflow.com/questions/41953300/how-to-delete-the-old-git-history#41953383 It's more convenient in case of GitHub because the origin has Issues attached to it, it's not easy to recreate it. – Maxim Kamalov Jul 02 '21 at 12:15
4

For those who get a lot of merge conflicts (and broken results) with rebase --onto, I'd like to recommend this script, which uses git filter-branch:

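#!/bin/sh
# Usage: git-truncate.sh <cut_sha> <branch>

cut_sha="$1"   # newest commit to cut off (the parent of the new root commit)
branch="$2"    # branch to rewrite

# Drop the parent link to the cut commit so that its child becomes the new root
git filter-branch \
  --parent-filter "sed -e 's/-p $cut_sha[0-9a-f]*//'" \
  --prune-empty \
  -- "$branch"

# Delete the backup refs that filter-branch leaves under refs/original
git for-each-ref --format='%(refname)' refs/original | \
  while read -r ref
  do
    git update-ref -d "$ref"
  done

# Expire the reflog and remove the objects that are now unreachable
git reflog expire --expire=0 --all
git repack -ad
git prune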

Source: https://github.com/adrienthebo/git-tools/blob/master/git-truncate

Instructions:

  1. Save the script above to the root of your local repository (for example, as git-truncate.sh).
  2. Check out the branch you'd like to truncate (for example, master).
  3. Go down the history and find the first (newest) commit SHA you want to cut off (assume it's 2c75a32) AND ensure that commit has no branches in parallel!
  4. Run it like this: $ ./git-truncate.sh 2c75a32 master.
  5. (Force-push, if a remote is present.)

IMPORTANT: The SHA must be "part" of the branch and it must be the first commit you want to delete. Don't pass the first commit you want to keep (the new "beginning of repository" commit)!
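Since the script expects the newest commit to delete rather than the first commit to keep, one way to avoid mixing them up is to derive the argument from the commit you want as the new root. The sketch below is an illustration on a hypothetical throwaway history a..k: it takes f as the commit to keep, computes its parent as the SHA to pass, and counts merge commits after the cut as a rough (not exhaustive) check for parallel branches. It does not run the truncate script itself.

```shell
#!/bin/sh
set -eu

# Throwaway repository with a linear history a..k (hypothetical content).
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"
for name in a b c d e f g h i j k; do
  echo "$name" >> log.txt
  git add log.txt
  git commit -q -m "$name"
done

keep=$(git rev-parse master~5)        # commit "f": the desired new root
cut_sha=$(git rev-parse "$keep^")     # its parent "e": the SHA the script expects
cut_subject=$(git log -1 --format=%s "$cut_sha")

# Rough check only: merge commits after the cut hint at parallel branches.
merges=$(git rev-list --merges --count "$cut_sha..master")

echo "pass $cut_sha ($cut_subject) to the script; merges after the cut: $merges"
```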

Marcel