42

I have a very large Git repository which only contains binary files which are changed pretty often. Naturally the Git repository is much larger than the actual files in it. I don't really care about old history, I only need some of the newer history to be able to revert some wrong changes. So let's say I want to remove all commits except the last five.

Naturally I want to do this to keep the repository small so the removed commits must be purged completely from the repo.

And I want to do all this non-interactively with a single command (alias) or script. How can I do this?

kayahr
  • This might help: http://stackoverflow.com/questions/250238/collapsing-a-git-repositorys-history – Herr von Wurst Aug 13 '12 at 07:33
  • Are you sure you want to _remove_ all the old commits? It means also removing their changes. GIT doesn't store "current state" in each commit, it only stores a change. What you want to do is rather squash all the old commits into one, isn't it? – amorfis Aug 13 '12 at 09:21

2 Answers

21

Here's a rebase-last-five alias to get you started. It recreates the current branch so that only the most recent five commits remain in its history. It's probably best to turn this into a script that lives in a directory on your PATH: Git runs any executable named git-<name> that it finds there as git <name>, without any special configuration, so a script saved as git-rebase-last-five is invoked as git rebase-last-five (with a .sh extension, the extension becomes part of the subcommand name). A real script should do more error checking and handling than this simple alias.

$ git config --global alias.rebase-last-five '!b="$(git branch --no-color | cut -c3-)" ; h="$(git rev-parse $b)" ; echo "Current branch: $b $h" ; c="$(git rev-parse $b~4)" ; echo "Recreating $b branch with initial commit $c ..." ; git checkout --orphan new-start $c ; git commit -C $c ; git rebase --onto new-start $c $b ; git branch -d new-start ; git gc'

CAVEAT EMPTOR: Do heed the warnings about changing history.

Check the man pages (git help <command> or online) for further information.

An example usage:

$ git --version
git version 1.7.12.rc2.16.g034161a
$ git log --all --graph --decorate --oneline
* e4b2337 (HEAD, master) 9
* e508980 8
* 01927dd 7
* 75c0fdb 6
* 20edb42 5
* 1260648 4
* b3d6cc8 3
* 187a0ef 2
* e5d09cf 1
* 07bf1e2 initial
$ git rebase-last-five 
Current branch: master e4b2337ef33d446bbb48cbc86b44afc964ba0712
Recreating master branch with initial commit 20edb42a06ae987463016e7f2c08e9df10fd94a0 ...
Switched to a new branch 'new-start'
[new-start (root-commit) 06ed4d5] 5
 1 file changed, 1 insertion(+)
 create mode 100644 A
First, rewinding head to replay your work on top of it...
Applying: 6
Applying: 7
Applying: 8
Applying: 9
Deleted branch new-start (was 06ed4d5).
Counting objects: 35, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (15/15), done.
Writing objects: 100% (35/35), done.
Total 35 (delta 4), reused 0 (delta 0)
$ git log --all --graph --decorate --oneline
* a7fb54b (HEAD, master) 9
* 413e5b0 8
* 638a1ae 7
* 9949c28 6
* 06ed4d5 5
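
For readers who would rather see the steps one per line, the alias could be decomposed roughly as follows. This is only a sketch, not the answer's own script: the function name keep_last_commits and its optional count argument are made up for illustration, the current branch is looked up with git rev-parse --abbrev-ref HEAD, and the cleanup expires the reflog before pruning, as suggested in the comments below. It assumes a linear history, a clean working tree, and no existing branch called new-start. It rewrites history, so try it on a copy of the repository first.

```shell
#!/bin/sh
# Hypothetical helper (name and argument are assumptions, not part of Git):
# keep only the newest N commits of the current branch (default 5).
keep_last_commits() {
    n="${1:-5}"

    # Name of the branch that is currently checked out.
    branch="$(git rev-parse --abbrev-ref HEAD)" || return 1
    if [ "$branch" = "HEAD" ]; then
        echo "Detached HEAD; check out a branch first." >&2
        return 1
    fi

    # Oldest commit to keep: N-1 steps behind the branch tip.
    # Fails if the branch has fewer than N commits.
    base="$(git rev-parse --verify "$branch~$((n - 1))")" || return 1
    echo "Recreating $branch branch with initial commit $base ..."

    # Start a parentless branch whose content is the snapshot at $base,
    # and commit it, reusing the message and author of $base.
    git checkout --orphan new-start "$base" || return 1
    git commit -C "$base" || return 1

    # Replay the commits that follow $base on top of the new root commit.
    git rebase --onto new-start "$base" "$branch" || return 1
    git branch -d new-start

    # The reflog still references the old commits; expire it, then prune.
    git reflog expire --expire=now --all
    git gc --prune=now
}
```

Saved as an executable file on your PATH under a name such as git-keep-last (a hypothetical name), with a final line `keep_last_commits "$@"`, this could be run as `git keep-last 5`.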
Go Dan
  • Works great! Thanks! But instead of using just `git gc` I had to use `git reflog expire --expire=now --all; git gc --prune=now` to actually make the repository smaller. – kayahr Aug 17 '12 at 15:52
  • Thanks a lot, it works. You should break the script up into several lines in your answer. I personally prefer to understand what I am doing to my repo before executing an external script. – JulienD Jul 14 '16 at 14:13
  • This worked well for me after one fix. `git branch --no-color | cut -c3-` did not return the current branch for me; I changed the script to use `git rev-parse --abbrev-ref HEAD`. The complete alias I used, then, was: `git config --global alias.rebase-last-five '!b="$(git rev-parse --abbrev-ref HEAD)" ; h="$(git rev-parse $b)" ; echo "Current branch: $b $h" ; c="$(git rev-parse $b~4)" ; echo "Recreating $b branch with initial commit $c ..." ; git checkout --orphan new-start $c ; git commit -C $c ; git rebase --onto new-start $c $b ; git branch -d new-start ; git gc'` – greg_1_anderson Sep 30 '18 at 19:37
  • Where should I place the git-rebase-last-five.sh file? – kishan verma Mar 20 '20 at 12:44
14

Ok, if you want what I think you want (see my comment), I think this should work:

  1. Create a branch that keeps all the commits (just in case):

    git branch fullhistory

  2. While still on master, reset --hard to the commit you want to retain history from:

    git reset --hard HEAD~5

  3. Now reset with --soft to the first commit of the history. This leaves your working tree and index untouched, so they stay in the HEAD~5 state.

    git reset --soft <first_commit>

  4. master now contains only the first commit, and all the changes you need are staged. Just commit them.

    git commit -m "all the old changes squashed"

  5. Now cherry-pick the commits you want to keep from fullhistory (the five newest ones, given the reset to HEAD~5 in step 2):

    git cherry-pick A..B

Here A is older than B, and remember that A itself is NOT included, so A should be the parent of the oldest commit you want to include.
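
The five steps above could be combined into a single function, roughly as follows. This is only a sketch under assumptions: the name squash_old_history and its count argument are invented here, the first commit is looked up with git rev-list --max-parents=0 instead of being typed in by hand, and the history is assumed to be linear with a clean working tree. The resulting branch consists of the first commit, the squashed commit, and then the cherry-picked commits.

```shell
#!/bin/sh
# Hypothetical helper (name and argument are assumptions, not part of Git):
# squash everything older than the newest N commits (default 5) into one commit.
squash_old_history() {
    keep="${1:-5}"

    # Step 1: keep a pointer to the complete history, just in case.
    git branch fullhistory || return 1

    # Look up the first (parentless) commit of the current branch.
    root="$(git rev-list --max-parents=0 HEAD | tail -n 1)"

    # Step 2: move the branch back to the newest commit that gets squashed.
    git reset --hard "HEAD~$keep" || return 1

    # Step 3: move the branch to the first commit, keeping index and files.
    git reset --soft "$root" || return 1

    # Step 4: commit the staged state as one squashed commit.
    git commit -m "all the old changes squashed" || return 1

    # Step 5: replay the newest N commits; the range excludes its left end.
    git cherry-pick "fullhistory~$keep..fullhistory"
}
```

As long as the fullhistory branch exists, the old commits remain reachable, so the repository will not shrink until that branch is deleted and the unreachable objects are pruned.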

amorfis