How can the result of a rebase differ from the result of a merge?

Question

In one of GitHub's articles I read the following:

You aren't able to automatically rebase and merge on GitHub when: Rebasing the commits is considered "unsafe", such as when a rebase is possible without merge conflicts but would produce a different result than a merge would.

It isn't clear to me how a rebase can produce a different result than a merge.

Can anyone explain how this is possible?

Link to the original article: https://help.github.com/articles/about-pull-request-merges/

A merge typically combines the work of 2 branches whereas a rebase rewinds commits made in a branch, switches heads, and then replays those commits on top of the new head. Rebase will rewrite history and a merge appends history. — castis, Jun 14 '17 at 14:05
@castis Thank you for your answer. I understand the underlying details of how both operations work, though. I'm asking about the final result: the state the code ends up in. How can it depend on which operation I perform? — Victor Dombrovsky, Jun 14 '17 at 14:25
@castis These two operations will always produce different results if we define the results as you do. But the sentence I cited states that they differ only in some cases, and in those cases the rebase is considered "unsafe". — Victor Dombrovsky, Jun 14 '17 at 14:31
It's not immediately clear to me what they mean. A link to the actual article might help. — torek, Jun 14 '17 at 14:35
After reading that I am now curious as to the outcome of this question. I also edited in the beginning of the statement from github to increase question clarity. — castis, Jun 14 '17 at 17:10
OK, I see they have separated out merge conflict cases, and they are talking about commits with a defined merge base, so that leaves just one possibility: rebasing a chain of commits that contain their own merges with their own resolutions. (Or maybe two: rebasing when the merge base is a virtual merge base. I think that case has no conflict-free different-results, but inner conflicts are "hidden" so there may be some path here.) — torek, Jun 14 '17 at 17:37
@castis: I've added an illustrative script and some graph drawings of at least one way to come up with a different final result. — torek, Jun 14 '17 at 21:05
As of today, there seem to be at least two fundamentally unrelated ways in which this could arise - at which point you have to start wondering if there are others, too. So perhaps the *best* question to ask is: "How does GitHub go about deciding that a given rebase-and-merge operation is subject to this condition?" (Which I guess only GitHub can answer.) Unless they actually perform both operations and compare the results, I'm skeptical that they can consistently apply this rule — Mark Adelsberger, Jun 30 '17 at 17:15

torek · Accepted Answer · 2017-06-30T16:55:56.207

Here's a construction proof of a case where rebase and merge produce different results. I assume this is the case they are talking about.

Edit: There is another case that can occur: the side branch to be rebased or merged contains one commit that will be skipped during a rebase (due to patch-ID matching), followed by a reversion of that commit (which will not be skipped). See Changes to a file are not retained by merge, why? If I have time later I will try to add a construction proof for that example as well.
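
A minimal sketch of that patch-ID case, under some assumptions: the file name f, the branch names side, merged and rebased, the demo identity, and the mktemp scratch directory are all made up for illustration. Both master and side make the same change to f, and side then reverts it. The merge sees no net change on side and keeps the line; the rebase drops the duplicate commit and replays only the revert, which removes it.

```sh
#!/bin/sh
# Sketch only: names (f, side, merged, rebased) are hypothetical.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name used here

echo base > f && git add f && git commit -q -m A
git branch side

# master: add a line to f.
echo feature >> f && git commit -q -a -m "B: add feature"

# side: make the identical change (same patch-ID as B), then revert it.
git checkout -q side
echo feature >> f && git commit -q -a -m "C: add feature (same patch as B)"
git revert --no-edit HEAD >/dev/null

# Merge side into a copy of master: side's net change is nothing,
# so the line added by B survives.
git checkout -q -b merged master
git merge -q --no-edit side
merge_result=$(cat f)

# Rebase a copy of side onto master: C is not re-applied (B already has
# that patch), but the revert is, so the line disappears.
git checkout -q -b rebased side
git rebase -q master
rebase_result=$(cat f)

echo "after merge:";  echo "$merge_result"
echo "after rebase:"; echo "$rebase_result"
```

Neither operation reports a conflict here, yet the two final versions of f differ.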

The trick is that since rebase copies commits but omits merges, we need to drop a merge whose resolution is not a simple composition of its predecessors. For this merge to have had no conflicts, I think it must be an "evil merge" (a merge commit that introduces changes found in neither of its parents), so this is what I put into the script.

The graph we build looks like this:

  B   <-- master
 /
A--C--E   <-- branch
 \   /
  \ /
   D   <-- br2

If you are on master (your tip commit is B) and you git merge branch, this combines the changes from diffing A-vs-B with those from diffing A-vs-E. The resulting graph is:

  B-----F   <-- master
 /     /
A--C--E   <-- branch
 \   /
  \ /
   D   <-- br2

and the contents of commit F are determined by those of A, B, and E.

If you are on branch (your tip commit is E) and you git rebase master, this copies commits C and D, in some order (it's not clear which). It completely omits commit E. The resulting graph is:

  B   <-- master
 / \
A   C'-D'   <-- branch
 \
  D   <-- br2

(the original C and E are only available through reflogs and ORIG_HEAD). Moving master in a fast-forward fashion, the tip of master becomes commit D'. The contents of commit D' are determined by adding the changes extracted from C and D to B.

Since we used an "evil merge" to make changes in E that appear in neither C nor D, those changes vanish.
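
Both halves of that claim can be checked with a short sketch. It rebuilds the same commits in a scratch directory (the mktemp directory and demo identity are assumptions of mine), counts which commits in master..branch are merges, and then redoes E's merge mechanically on a detached HEAD to compare the automatic result with the tree actually recorded in E.

```sh
#!/bin/sh
# Sketch only: scratch repo and demo identity are assumptions.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Same commits as the script in this answer.
echo README > README && git add README && git commit -q -m A
git branch branch
echo for master > bfile && git add bfile && git commit -q -m B
git checkout -q -b br2 branch
echo file for C > cfile && git add cfile && git commit -q -m C
git checkout -q branch
echo file for D > dfile && git add dfile && git commit -q -m D
git merge -q --no-commit br2 && git rm -q -f cfile && git commit -q -m E

# A plain rebase replays only the non-merge commits in master..branch.
replay_count=$(git rev-list --no-merges --count master..branch)
merge_count=$(git rev-list --merges --count master..branch)
echo "non-merge commits to replay: $replay_count, merges left out: $merge_count"

# Redo E's merge without committing and record the automatic tree.
git checkout -q --detach 'branch^1'
git merge -q --no-commit --no-ff 'branch^2'
auto_tree=$(git write-tree)
git merge --abort
git checkout -q branch

# Compare with what E really contains.
real_tree=$(git rev-parse 'branch^{tree}')
if [ "$auto_tree" = "$real_tree" ]; then
    echo "E is an ordinary merge"
else
    echo "E differs from the automatic merge of its parents:"
    git diff --name-status "$auto_tree" "$real_tree"
fi
```

The difference between the two trees is exactly the change that exists only in E, which is what a rebase loses.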

Here is the script that creates the problem (note, it makes a temporary directory tt that it leaves in the current directory).

#! /bin/sh

fatal() {
    echo fatal: "$@" 1>&2; exit 1
}

[ -e tt ] && fatal tt already exists

mkdir tt && cd tt && git init -q || fatal failed to create tt repo

echo README > README && git add README && git commit -q -m A || fatal A
git branch branch || fatal unable to make branch
echo for master > bfile && git add bfile && git commit -q -m B || fatal B

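# Note: this script creates C on br2 and D on branch, the reverse of the
# drawing above; only the labels differ, the outcome is the same.
git checkout -q -b br2 branch || fatal checkout -b br2 branch
echo file for C > cfile && git add cfile && git commit -q -m C || fatal C
git checkout -q branch || fatal checkout branch
echo file for D > dfile && git add dfile && git commit -q -m D || fatal D
# E is the "evil merge": it merges br2 cleanly, then also deletes cfile.
git merge -q --no-commit br2 && git rm -q -f cfile && git commit -q -m E ||
    fatal E
git branch -D br2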
git checkout -q master || fatal checkout master

echo merging branch
git merge --no-edit branch || fatal merge failed
# The unquoted * expands to the names of the files now in the work tree.
echo result is: *

echo removing merge, replacing with rebase of branch onto master
git reset -q --hard HEAD^ || fatal reset failed
git checkout -q branch || fatal switch back to branch failed
git rebase master || fatal rebase failed
echo result is: *

echo removing rebase as well so you can poke around
git reset --hard ORIG_HEAD
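
Rather than eyeballing file listings, one could compare the two outcomes directly by their tree hashes. The following is a rough sketch of that idea, not a description of how GitHub decides anything: it performs the merge and the rebase on scratch branches (try-merge and try-rebase are names I made up) so that master and branch themselves are never moved, and it runs in a mktemp directory with a demo identity.

```sh
#!/bin/sh
# Sketch only: try-merge / try-rebase are hypothetical scratch branches.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Same commits as the script above.
echo README > README && git add README && git commit -q -m A
git branch branch
echo for master > bfile && git add bfile && git commit -q -m B
git checkout -q -b br2 branch
echo file for C > cfile && git add cfile && git commit -q -m C
git checkout -q branch
echo file for D > dfile && git add dfile && git commit -q -m D
git merge -q --no-commit br2 && git rm -q -f cfile && git commit -q -m E

# Outcome 1: merge branch into a copy of master.
git checkout -q -b try-merge master
git merge -q --no-edit branch
merge_tree=$(git rev-parse 'HEAD^{tree}')

# Outcome 2: rebase a copy of branch onto master.
git checkout -q -b try-rebase branch
git rebase -q master
rebase_tree=$(git rev-parse 'HEAD^{tree}')

# Identical tree hashes mean identical file contents.
if [ "$merge_tree" = "$rebase_tree" ]; then
    echo "merge and rebase give the same files"
else
    echo "merge and rebase differ in:"
    git diff --name-status try-merge try-rebase
fi
```

Comparing trees rather than commits matters here: the commit hashes always differ between a merge and a rebase, while the trees differ only when the resulting files do.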

Thanks for the explanation! It's pretty clear. But can you clarify what you mean by "evil merge"? As I understand it, judging by your answer, it is a merge with a conflict that was resolved manually, where the user added some code to the merge commit while resolving the conflict. So, when we resolve conflicts by adding code that is not part of either merged branch, we do an evil merge. Correct? — Victor Dombrovsky, Jun 15 '17 at 07:07
@VictorDombrovsky: Yes; the "evil merge" phrase above is also a link, which if you click on it, takes you to a StackOverflow question about it. :-) (I see only the first occurrence of that string is a link; I'll make them both link to it) — torek, Jun 15 '17 at 14:37
