Is a rebase the only way to fix a wrong cherry-pick?

Question

Branch #1 contains a bug fix C1. Branch #2 first cherry-picked C1; then the branch #2 owner realised the work done in C1 was actually wrong, so he committed the correct fix C2.

In C2 he basically removed the change made in C1 and replaced it with the correct change. When the branch #1 owner wants to pick up the fix, a merge won't work, because the merge result C3 will contain both C1 and the correct fix introduced in C2, i.e. C1 will be kept by the merge.

Branch #2 now does NOT contain the C1 code at all, so from the merge's point of view branch #2 never touched it, and the merge keeps branch #1's copy.

base -- C1'-- C2'(remove C1' change) -- C2''(add the correct fix) <--branch #2


     /---- C1---------------C3 (merge won't do b/c C1 will be kept)  <-- branch #1
base                       /
     \---- C1' ---- C2----/  <--branch #2

A rebase will work because it discards C1: "a patch already accepted upstream with a different commit message or timestamp will be skipped".

      /---- C1------ xxxxx  ----C3 (Rebase will discard C1) <-- branch #1
base                       /
      \---- C1' ---- C2---/  <--branch #2
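
The two diagrams above can be reproduced in a throwaway repository. This is only a minimal sketch under assumptions: the file names (wrong_fix.txt, right_fix.txt) and the scratch branches merge-try and rebase-try are hypothetical, and C1 and the correct fix are modelled as changes to different files so that the merge does not conflict.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Demo" && git config user.email "demo@example.com"

echo "shared code" > app.txt
git add app.txt && git commit -q -m "base"
git branch branch1 && git branch branch2

# branch #1: C1, the fix that later turns out to be wrong
git checkout -q branch1
echo "wrong fix" > wrong_fix.txt
git add wrong_fix.txt && git commit -q -m "C1: bug fix"

# branch #2: cherry-pick C1 as C1'. -x appends a note to the message,
# which guarantees C1' gets a different hash from C1.
git checkout -q branch2
git cherry-pick -x branch1 >/dev/null
# C2: remove the C1 change and add the correct fix
git rm -q wrong_fix.txt
echo "correct fix" > right_fix.txt
git add right_fix.txt && git commit -q -m "C2: replace C1 with the correct fix"

# Attempt 1: merge branch #2 into a copy of branch #1
git checkout -q -b merge-try branch1
git merge -q --no-edit branch2
if [ -f wrong_fix.txt ]; then merge_keeps_c1=yes; else merge_keeps_c1=no; fi

# Attempt 2: rebase a copy of branch #1 onto branch #2
git checkout -q -b rebase-try branch1
git rebase -q branch2
if [ -f wrong_fix.txt ]; then rebase_keeps_c1=yes; else rebase_keeps_c1=no; fi

echo "merge keeps C1: $merge_keeps_c1, rebase keeps C1: $rebase_keeps_c1"
```

With default settings the merge attempt should end up with both files, while the rebase attempt should drop C1 as already applied and end up with only the correct fix.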

My question is related to "When do you use git rebase instead of git merge?", although none of the answers there mention my case. So the question is: if I know someone cherry-picked my commit, had I better rebase onto his branch instead of merging it?

Is there another way to avoid (or fix) the problem I mentioned here?

------ update-------

I understand that the cherry-pick is why this happened. But several times I have been in the situation where I just want a specific commit now and don't want the other commits in their branch yet. So I don't know if there is a better way to do it.

score 3 · Answer 1 · answered Jul 17 '16 at 08:15

You are essentially correct. Rebase is, however, not a cure-all for the problem.

Let's draw a slightly more complex situation. We'll start with a similar commit graph:

...--o--o

The final o node here is the tip commit of some branch, with some earlier commit o also not very distinguished. We won't bother with branch labels because they are just aimed at helping humans, and we're looking at what Git does, rather than what humans do. :-)

Along comes one human who makes a new commit, your commit C1 (I'll just call it C though) that has a bug in it:

          C
         /
...--o--o

Meanwhile, in this repo, or in some other clone of it, along comes a different human who makes an unrelated commit F:

          C
         /
...--o--*
         \
          F

Commits C and F are probably on different (named) branches, but the important thing is that they have a common base * (this used to be just marked o but now we need to remember that it's the common base—though it's pretty obvious from the drawing).

In your particular scenario, the second user made F by cherry-picking C. Let's say that in our case, the second user made F quite independently. Now that second user decides that now is the time to cherry-pick C, so they get a copy of it—but it does not apply cleanly, so they change it slightly—hand-edit it—so that it applies. Now they have:

          C
         /
...--o--*
         \
          F--G

Note, again, that commit G is mostly, but not quite, a copy of C—which, as we noted, is about to be deemed defective.

Your first human therefore reverts C to, in effect, remove it from his branch, then adds D (the corrected fix):

          C--R--D
         /
...--o--*
         \
          F--G

Your second human goes on to add more commits:

          C--R--D       <-- branch1
         /
...--o--*
         \
          F--G--H--I    <-- branch2

(this time I've put in the branch names too).

When rebase works and when it fails

What git rebase does is, in essence, find commits that are in common between the two branches, and that are exclusive to each of the two branches. Your second human will come along and try to rebase the F-G-H-I sequence atop D.

The common commits start from the merge base * and work backwards; rebase gets to ignore these entirely.

The commits to be copied start after the merge base and end with the tip-most commit, hence are F, G, H, and I.

But, before copying these, Git checks the commits exclusive to the "other side": commits after the merge base * that end with D. These are C (the bad commit), R (the revert of C), and D. It uses git patch-id on each of those commits, and also on all the commits set to be copied. If the patch ID of one of the "to be copied" commits matches the patch ID of one of the "already in the chain ending with D" commits, Git drops that commit.

This is how, when commit G is an exact (not-hand-edited) copy of C, Git can drop G and just copy F, H, and I. The exact copy winds up with the same patch-ID. But this G was hand-edited to make it fit, which changed its patch-ID. Rebase therefore copies G, giving:

          C--R--D               <-- branch1
         /       \
...--o--*         F'-G'-H'-I'   <-- branch2
         \
          F--G--H--I            [abandoned]

So, while git merge definitely fails, git rebase sometimes also fails (specifically when a cherry-picked commit had to be modified to fit). In this case, that happened because of a conflict between F and the cherry-picked C, but there are plenty of ways to run into this.
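
The difference between an exact copy and a hand-edited copy can be checked directly with git patch-id. The following is a sketch only: the repository layout, the file contents, the scratch branches exact and edited, and the patch_id helper are all made up for illustration.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Demo" && git config user.email "demo@example.com"

printf 'a\nb\nc\n' > code.txt
git add code.txt && git commit -q -m "base (*)"
git branch branch1 && git branch branch2

# branch1: commit C
git checkout -q branch1
printf 'a\nb fixed\nc\n' > code.txt
git commit -q -am "C: bug fix"

# branch2: unrelated commit F
git checkout -q branch2
echo "other work" > other.txt
git add other.txt && git commit -q -m "F: unrelated work"

# G as an exact copy of C
git checkout -q -b exact branch2
git cherry-pick -x branch1 >/dev/null

# G as a hand-edited copy of C: apply without committing, tweak, then commit
git checkout -q -b edited branch2
git cherry-pick -n branch1
printf 'a\nb fixed, adapted to fit\nc\n' > code.txt
git commit -q -am "G: hand-edited copy of C"

# Hypothetical helper: patch ID of the tip commit of a branch
patch_id() { git show "$1" | git patch-id | cut -d' ' -f1; }
id_c=$(patch_id branch1)
id_exact=$(patch_id exact)
id_edited=$(patch_id edited)

if [ "$id_c" = "$id_exact" ]; then echo "exact copy: same patch ID as C"; fi
if [ "$id_c" != "$id_edited" ]; then echo "hand-edited copy: different patch ID"; fi
```

Only the exact copy shares a patch ID with C, so only that one is a candidate for being dropped by a later rebase.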

Is there another way to avoid (or fix) the problem I mentioned here?

Ideally, instead of cherry-picking C in the first place, whoever is working on branch2 would just rebase onto C at that time, and then rebase onto R again later if needed (or just straight onto D), or merge after said rebase. Let's see what the graph looks like if the second human, working on branch2, had rebased his F commit onto C instead of cherry-picking. Let's draw the before-rebase:

          C    <-- branch1
         /
...--o--*
         \
          F    <-- branch2

and move C down a few lines, which is exactly the same commits, just drawn more linearly:

...--o--*---C   <-- branch1
         \
          F     <-- branch2

and now let's copy F to F' atop C and move the branch label:

...--o--*---C      <-- branch1
         \   \
          \   F'   <-- branch2
           \
            F      [abandoned]

The merge base of C and F' is now C itself, rather than commit *. Let's put the remaining commits in, unmarking the * commit and dropping abandoned commits:

...--o--o---C--R--D     <-- branch1
             \
              F'-H--I   <-- branch2

If we now use git merge to merge commit I atop commit D, we won't re-introduce bad commit C via G, since there now is no G.
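
The rebase-first workflow drawn above can be tried out as follows. This is a sketch with invented file names, and it assumes nobody else has already built on the old F commit.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Demo" && git config user.email "demo@example.com"

echo "shared code" > app.txt
git add app.txt && git commit -q -m "base (*)"
git branch branch1 && git branch branch2

# branch1: commit C (later found to be faulty)
git checkout -q branch1
echo "wrong fix" > wrong_fix.txt
git add wrong_fix.txt && git commit -q -m "C: bug fix"
c_commit=$(git rev-parse HEAD)

# branch2: commit F, then rebase onto C instead of cherry-picking C
git checkout -q branch2
echo "feature" > feature.txt
git add feature.txt && git commit -q -m "F: unrelated work"
git rebase -q branch1            # F becomes F' on top of C

# branch1: R (revert of C) and D (corrected fix)
git checkout -q branch1
git revert --no-edit HEAD >/dev/null
echo "correct fix" > right_fix.txt
git add right_fix.txt && git commit -q -m "D: corrected fix"

# branch2: more work (H)
git checkout -q branch2
echo "more" > more.txt
git add more.txt && git commit -q -m "H: more work"

# Merge branch2 into branch1; the merge base is now C itself
merge_base=$(git merge-base branch1 branch2)
git checkout -q branch1
git merge -q --no-edit branch2
if [ -f wrong_fix.txt ]; then bad_fix_present=yes; else bad_fix_present=no; fi
if [ -f right_fix.txt ]; then good_fix_present=yes; else good_fix_present=no; fi
echo "bad fix present after merge: $bad_fix_present"
```

Because C is the merge base here, the revert R counts as a real change on the branch1 side, and the merge has no second copy of C to bring back.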

Of course, if multiple people are using branch2—if the old F commit is published—this rebase-makes-a-copy thing means they must all switch to using the new copies, every time we rebase.

Testing

Is there another way to avoid (or fix) the problem I mentioned here?

Ideally, when someone found a bug, before writing commit C at all, they wrote a test case. The test case showed that commit C was required and that commit C fixed the bug, which is why commit C was committed in the first place.

When C was found to be faulty, the test case for it should have been improved, or an additional test case written, demonstrating that commit C was not quite right. This also is why revert R went in, and subsequent better fix D. (Perhaps D was, in essence, a squash of R and the replacement fix—though the fact that C got copied suggests that R should exist as a stand-alone reversion.)

These tests will now show the problem if a rebase or merge re-introduces a slight variation of commit C, such as our hypothetical commit G. That won't avoid or fix the problem itself, but will at least catch it right away.

Of course, +1, but I deliberately did not go into the "slightly more complex situation" in my answer ;) – VonC, Jul 17 '16 at 08:32
Hi, thanks for the detailed answer (+1 for sure)! But I have to say the situation you described is a bit different from mine. In my case the branch #2 guy found that commit C is buggy and fixed it, and now it is the branch #1 guy who wants the fix. But in your case branch #1 fixed C and branch #2 wants it. – Qiulang, Jul 17 '16 at 08:48

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

So the question is: if I know someone cherry-picked my commit, had I better rebase onto his branch instead of merging it?

Yes, because, as I stated before, a cherry-pick duplicates a commit between branches. Since a rebase skips a duplicate commit, that is a good way out.
See "Git cherry pick and datamodel integrity" for a concrete illustration.
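
To find out whether someone has cherry-picked one of your commits before choosing between merge and rebase, git cherry can list the duplicates. The sketch below uses a made-up repository and file names; lines starting with "-" are commits on your branch that already have an equivalent on the other branch.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.name "Demo" && git config user.email "demo@example.com"

echo "shared code" > app.txt
git add app.txt && git commit -q -m "base"
git branch branch1 && git branch branch2

# branch1 (yours): C1 plus one commit nobody has copied
git checkout -q branch1
echo "wrong fix" > wrong_fix.txt
git add wrong_fix.txt && git commit -q -m "C1: bug fix"
echo "my own work" > mine.txt
git add mine.txt && git commit -q -m "my own work"

# branch2 (his): cherry-picks C1 only
git checkout -q branch2
git cherry-pick -x branch1~1 >/dev/null

# Compare branch1 against branch2:
# "-" = an equivalent change is already on branch2, "+" = only on branch1
report=$(git cherry -v branch2 branch1)
echo "$report"
duplicates=$(echo "$report" | grep -c '^-' || true)
unique=$(echo "$report" | grep -c '^+' || true)
```

A non-zero number of "-" lines is the signal that a plain merge would carry the duplicated change on both sides.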

Is there another way to avoid (or fix) the problem I mentioned here?

If you intend to eventually merge two branches, there should be no cherry-picking between the two. Only merges or rebases.

Cherry-picking a bug-fix is a good idea only if the branches are not supposed to be merged.

answered Jul 17 '16 at 07:34 by VonC

“Cherry-picking a bug-fix is a good idea only if the branches are not supposed to be merged.” Say I want that bug fix C1 and know it is correct, but I don't want the other commits in his branch (yet); what better way is there than cherry-pick? – Qiulang Jul 17 '16 at 07:54
@Qiulang in that case no, provided you can rebase the branch later (that won't duplicate the cherry-picked commit) – VonC Jul 17 '16 at 07:56
OK, I will say I basically figured out my own question (and I think I know cherry-pick pretty well). But I will give your answer the green tick mark for answering my question and confirming it :) – Qiulang Jul 17 '16 at 07:59
