20

I had to run git filter-branch the other day. I followed the instructions on GitHub, but something went wrong. I think someone on the team didn't rebase a local branch and merged the changes instead. Ever since, the commit log has been filled with double-commits, e.g.:

commit b0c03ec925c0b97150594a99861d8f21fd3ab22d
Author: XXX
Date:   Wed Mar 19 17:01:52 2014 -0400

    Removed most clearfixs in templates

commit f30c21d21b5ea715a99b0844793cb4b5f5df97a1
Author: XXX
Date:   Wed Mar 19 17:01:52 2014 -0400

    Removed most clearfixs in templates

commit 2346be43d0e02d3987331f0a9eeb2f12cd698ede
Author: XXX
Date:   Wed Mar 19 16:40:26 2014 -0400

    new redirect logic

commit 1383070b31bde1aaa9eda7c2a9bcb598dd72247b
Merge: d1e2eb6 94e07fe
Author: XXX
Date:   Wed Mar 19 16:28:41 2014 -0400

    Merge branch 'develop' of github.com:xxx/xxx into develop

commit 79ce7824688cf2a71efd9ff82e3c7a71d53af229
Merge: 6079061 1ed3967
Author: XXX
Date:   Wed Mar 19 16:28:41 2014 -0400

    Merge branch 'develop' of github.com:xxx/xxx into develop

commit d1e2eb645a4fe2a1b3986082d0409b4075a0dbc9
Author: XXX
Date:   Wed Mar 19 16:28:36 2014 -0400

    Fixed broken responsiveness for companies listing page and code refactoring.

commit 6079061f6ef1f856f94d92bc0fdacf18854b8a89
Author: XXX
Date:   Wed Mar 19 16:28:36 2014 -0400

    Fixed broken responsiveness for companies listing page and code refactoring.

Weirdly enough, not all the commits are doubled-up, such as "new redirect logic" above. Is there anything I can do to fix this? It's relatively benign, but now our commit history looks like crap. This SO post suggested just leaving it as-is, but I'd rather have a clean commit history for the sake of posterity.

ysimonson

2 Answers

25

The command to accomplish that is:

git rebase -i HEAD~7

This will open up your editor with something like this:

pick f392171 Removed most clearfixs in templates
pick ba9dd9a Removed most clearfixs in templates
pick df71a27 new redirect logic
pick 79ce782 Merge branch 'develop' of github.com:xxx/xxx into develop
pick 1383070 Merge branch 'develop' of github.com:xxx/xxx into develop
...

Now you can tell git what to do with each commit. Let's keep commit f392171 and squash its duplicate, ba9dd9a, into it, leaving us with one clean commit. The duplicated merge commit 1383070 is squashed into 79ce782 in the same way.

Change your file to this:

pick f392171 Removed most clearfixs in templates
squash ba9dd9a Removed most clearfixs in templates
pick df71a27 new redirect logic
pick 79ce782 Merge branch 'develop' of github.com:xxx/xxx into develop
squash 1383070 Merge branch 'develop' of github.com:xxx/xxx into develop

When you save and exit the editor, Git applies the changes and then puts you back into the editor to merge the commit messages of each squashed pair:

# This is a combination of 2 commits.
# The first commit's message is:
Removed most clearfixs in templates

# This is the 2nd commit message:

Removed most clearfixs in templates

When done, save and quit your editor. Git will now squash the commits into one. All done!

Then you have to do

git push origin your-branch -f

to force your local commits onto the remote branch.
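A plain -f overwrites whatever is on the remote, including anything a teammate pushed in the meantime. A slightly safer variant is --force-with-lease, which refuses to overwrite the remote branch if it has moved since you last fetched it. Here is a minimal sketch in a throwaway repository (the temporary directories, file name, and commit messages are all made up for the demo; in your own repository only the push line matters):

```shell
#!/bin/sh
# Throwaway "remote" and clone so the sketch can run anywhere.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git clone -q "$work/remote.git" "$work/clone" 2>/dev/null
cd "$work/clone" || exit 1
git config user.email "demo@example.com"
git config user.name "Demo"

# Publish one commit.
echo one > file.txt
git add file.txt
git commit -q -m "first"
branch=$(git rev-parse --abbrev-ref HEAD)
git push -q origin "$branch"

# Rewrite history locally (stand-in for the interactive rebase).
git commit -q --amend -m "first (reworded)"

# Force the rewritten branch to the remote, but only if the remote
# branch is still where we last saw it.
git push -q --force-with-lease origin "$branch"

# After the push, the remote branch should point at the rewritten commit.
remote_head=$(git ls-remote origin "refs/heads/$branch" | cut -f1)
local_head=$(git rev-parse HEAD)
echo "remote: $remote_head"
echo "local:  $local_head"
```

In your own repository the equivalent command would be git push --force-with-lease origin your-branch.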

Note: You have to squash every duplicated commit.
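Two commits can share a message and still contain different changes, so before squashing it may help to check which commits really carry the same patch. One way to do that is git patch-id, which prints a "<patch-id> <commit-id>" line for each patch it reads; commits with the same patch id introduce the same diff. The helper below is only a sketch (the function name duplicate_patches is made up here); it groups such lines by patch id:

```shell
#!/bin/sh
# Reads "<patch-id> <commit-id>" lines on stdin (the output format of
# `git patch-id`) and prints every group of commits sharing a patch id.
duplicate_patches() {
    sort | awk '
        NF < 2 { next }                      # ignore malformed lines
        $1 == prev {
            if (!printed) {                  # first repeat: print group header
                print "same patch " $1 ":"
                print "  " first
                printed = 1
            }
            print "  " $2
            next
        }
        { prev = $1; first = $2; printed = 0 }
    '
}
```

Inside a repository it could be used like this: git log -p HEAD~50..HEAD | git patch-id | duplicate_patches (the range HEAD~50..HEAD is just an example). Merge commits show no diff in git log -p by default, so they will not appear in the output.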

sfletche
VAIRIX
  • When I ran this, the duplicate commits weren't adjacent to each other, e.g. for `Removed most clearfixs in templates`, one is on line 4512, the other on line 6683. Is there a way to fix this? I'm also worried that there could be a case where we have two commits with the same message but different contents, in which case we don't want to squash. Is there a way to check for this? – ysimonson Apr 03 '14 at 15:50
  • I don't understand your first question... can you explain it in more detail? As for renaming a commit, you have to use 'edit' instead of squash, save the file, and then run `git commit --amend`. This opens an editor, and you save the file with the new name of the commit. – VAIRIX Apr 03 '14 at 17:07
  • Regarding the first question, it's that while `git log` shows the double-commits being next to each other, the log file generated by this rebase command does not - they'll be in very different locations in the log file. For the second question, it's that over the course of history we've had commits with very different changes with the same message - e.g. merge commits will generally have the same message. I'm not interested in renaming these so much as checking which are redundant. Is there a way to show the summary of changes in the log file maybe? – ysimonson Apr 07 '14 at 12:21
  • What comes to mind is that you could first reorder the commits to bring together the ones you want to squash. To change the order, you just swap the lines of the commits in the file that `git rebase -i` opens. As for checking whether the commits are equal, I would use the `gitg` tool to look at the commits that have the same name and verify whether they contain the same changes. I don't know if there is any tool to compare them automatically. – VAIRIX Apr 07 '14 at 21:58
5

The answer by @VAIRIX is perfect, but there are complex cases where the duplicate commits don't appear adjacent to each other, so squashing alone won't help.

Take the history below (assume a~ is a duplicate of a):

 # h
 # g
 # f
 # c~
 # b~
 # a~
 # e
 # d
 # c
 # b
 # a

Command to follow (as described in the answer by @VAIRIX): git rebase -i HEAD~n. If you want to rebase onto master instead, use git rebase master -i, but git rebase -i HEAD~n is the better choice to avoid rebasing headaches.

Now, squash the repeated commits as below:

 pick h
 pick g
 pick f
 pick c~
 s b~
 s a~
 pick e
 pick d
 pick c
 pick b
 pick a

This will squash your commits into c~:

 # h
 # g
 # f
 # c~ (having changes of a~ and b~)
 # e
 # d
 # c
 # b
 # a

In my case, c~ was the anti-commit of c, so I just had to repeat the process, this time dropping the commit with d instead of squashing with s:

 pick h
 pick g
 pick f
 d c~ (having changes of a~ and b~)
 pick e
 pick d
 pick c
 pick b
 pick a

Now your history no longer contains the duplicate commits. You can use git diff to compare this branch against the original branch that still had the duplicates. There shouldn't be any diff if you did it perfectly.
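The comparison is easiest if you bookmark the branch before rewriting it. Here is a rough sketch of that workflow in a throwaway repository (the branch name backup-before-cleanup, the file, and the commit messages are invented for the demo, and git reset --soft is only a non-interactive stand-in for the squash you would do in git rebase -i):

```shell
#!/bin/sh
# Throwaway repository so the sketch can run anywhere.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -am "second"
echo three >> file.txt
git commit -q -am "third"

# 1. Bookmark the current state before rewriting history.
git branch backup-before-cleanup

# 2. Rewrite history: fold the last two commits into one
#    (stand-in for squashing in `git rebase -i`).
git reset -q --soft HEAD~2
git commit -q -m "second and third"

# 3. Compare the rewritten branch with the backup. `git diff --quiet`
#    exits with 0 when there is no difference in content.
if git diff --quiet backup-before-cleanup HEAD; then
    result="identical"
else
    result="different"
fi
echo "$result"
```

If the result is "different", the backup branch still points at the old history, so you can go back with git reset --hard backup-before-cleanup.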

This process might seem a little longer but you're assured that you didn't miss any commits.
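When the history is long, it can help to list up front which commit messages occur more than once, so you know which lines to look for in the rebase file. The helper below is a sketch under the assumption that its input has the form "<hash> <subject>", one commit per line; the name duplicate_subjects is made up:

```shell
#!/bin/sh
# Reads "<hash> <subject>" lines on stdin and prints every subject that
# occurs more than once, followed by the hashes that carry it.
duplicate_subjects() {
    awk '
        {
            hash = $1
            subject = $0
            sub(/^[^ ]+ /, "", subject)      # strip the leading hash
            count[subject]++
            hashes[subject] = hashes[subject] " " hash
            if (count[subject] == 1) order[++n] = subject
        }
        END {
            # keep the order in which subjects first appeared
            for (i = 1; i <= n; i++)
                if (count[order[i]] > 1)
                    printf "%s:%s\n", order[i], hashes[order[i]]
        }
    '
}
```

Inside a repository it could be fed with git log --format='%h %s' | duplicate_subjects. Merge commits often share the same message, so adding --no-merges to git log may cut down the noise. A repeated message alone does not prove the changes are the same, so check the contents before squashing or dropping anything.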

master_dodo
  • +1 for your username too :D. And yes, one should pick, squash and drop very carefully, and also specify the number of commits to start squashing from, HEAD~10 or something like that. – Tarandeep Singh May 08 '18 at 15:12