Context

git merge considers the setting merge.conflictStyle in case of merge conflicts. Possible values are merge (default) and diff3.

I noticed that diff3 sometimes produces much bigger conflicts (see example below). I found this paper, which describes the diff3 algorithm in great detail, but I couldn't find much about the default merge algorithm.

Question

What are the exact differences between the merge and diff3 algorithms? How exactly does the default merge algorithm work?

Example

I have these files:

  • Base:
1
2
3
  • Yours:
1
change1
change2
input1OnlyChange1
change3
change4
change5
change6
input1OnlyChange2
change7
change8
change9
2
3
  • Theirs:
1
change1
change2
input2OnlyChange1
change3
change4
change5
change6
input2OnlyChange2
change7
change8
change9
2
3

With merge I get two separate, small conflicts:

1
change1
change2
<<<<<<< HEAD
input1OnlyChange1
=======
input2OnlyChange1
>>>>>>> input2
change3
change4
change5
change6
<<<<<<< HEAD
input1OnlyChange2
=======
input2OnlyChange2
>>>>>>> input2
change7
change8
change9
2
3

However, with diff3 I get only one, much larger, conflict:

1
<<<<<<< HEAD
change1
change2
input1OnlyChange1
change3
change4
change5
change6
input1OnlyChange2
change7
change8
change9
||||||| 0fcee2c
=======
change1
change2
input2OnlyChange1
change3
change4
change5
change6
input2OnlyChange2
change7
change8
change9
>>>>>>> input2
2
3

This is my test script (PowerShell):

rm -Force -r ./repo -ErrorAction Ignore
mkdir ./repo
cd ./repo
git init

# git config merge.conflictStyle diff3

cp ../../base.txt content.txt
git add *; git commit -m first

git branch base

git checkout -b input2
cp ../../input2.txt content.txt
git add *; git commit -m input2

git checkout base
cp ../../input1.txt content.txt
git add *; git commit -m input1

git merge input2

Does the merge algorithm diff the diffs again to split up the bigger conflict? Clearly the merge algorithm also performs some kind of 3 way diff, as you don't get a conflict when you update base to match yours.

Official documentation

The docs say this:

> Specify the style in which conflicted hunks are written out to working tree files upon merge. The default is "merge", which shows a <<<<<<< conflict marker, changes made by one side, a ======= marker, changes made by the other side, and then a >>>>>>> marker. An alternate style, "diff3", adds a ||||||| marker and the original text before the ======= marker.

Clearly this does not explain the observed difference in the example.

Henning
  • I've noticed this too, in just this sort of situation, i.e. a base with nothing and two branches with large, slightly different things. Ironically, I then have to solve it with an old-fashioned direct diff between ours and theirs. – matt Aug 12 '22 at 11:54
  • However, it seems obvious why this is: in the diff3 style, both ours and theirs have to be contrasted with the original nothing so the whole addition is a hunk. – matt Aug 12 '22 at 11:58
  • I found this old unsatisfying answer: https://stackoverflow.com/a/17393240/11045512 Background of this question is our work on the new VS Code merge editor, which implements more or less diff3. Unfortunately, many users get much bigger conflicts now with the new merge editor than with the previous conflict markers generated by git. – Henning Aug 12 '22 at 12:01

2 Answers

Yes, this arises particularly when both sides added something where there was nothing before, but they added different things (hence the conflict, obviously).

> Clearly this does not explain the observed difference in the example

Actually, I think it does. In the two-part merge conflict display style, we just contrast ours against theirs, so regions of identical content are not shown as part of the conflict. But in the three-part diff3 merge conflict display style, we display the conflict by diffing ours against base and theirs against base; in a case where base is "nothing", as here, that means that both the ours display hunk and the theirs display hunk must consist of the entire inserted material.

From a practical point of view, this makes the conflict a lot harder for a human to solve when viewed as diff3 — and in actual fact, what I do is re-diff it the other way, diffing the ours hunk against the theirs hunk to help me "spot the difference" that needs thinking about. You can swap display styles in the middle of the conflict by saying git checkout --conflict <diff3|merge> <filepath>.
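The re-diffing step described above can be sketched in Python. This is a minimal illustration, not Git's implementation: it uses `difflib.SequenceMatcher` from the standard library in place of Git's own diff engine, and the function name `refine_conflict` is hypothetical. It takes the ours and theirs hunks of the single diff3-style conflict from the question and leaves only the differing lines inside conflict markers.

```python
from difflib import SequenceMatcher


def refine_conflict(ours, theirs, ours_label="HEAD", theirs_label="theirs"):
    """Re-diff the 'ours' hunk against the 'theirs' hunk of one conflict.

    Lines common to both sides are emitted as plain text; only the
    differing runs stay wrapped in two-part conflict markers.
    (Hypothetical helper, not part of Git.)
    """
    out = []
    matcher = SequenceMatcher(a=ours, b=theirs, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(ours[i1:i2])
        else:
            out.append(f"<<<<<<< {ours_label}")
            out.extend(ours[i1:i2])
            out.append("=======")
            out.extend(theirs[j1:j2])
            out.append(f">>>>>>> {theirs_label}")
    return out


# The two sides of the diff3-style conflict shown in the question.
ours = [
    "change1", "change2", "input1OnlyChange1",
    "change3", "change4", "change5", "change6",
    "input1OnlyChange2", "change7", "change8", "change9",
]
theirs = [line.replace("input1", "input2") for line in ours]

refined = refine_conflict(ours, theirs, "HEAD", "input2")
print("\n".join(refined))
```

For this input the sketch yields two small conflicts, one around each `inputNOnlyChange` line, which matches the shape of the merge-style output in the question. On other inputs a different diff algorithm may split the hunks differently than Git does.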


Addendum: Your comments lead me to suggest that you may have a misunderstanding here. The merge/diff3 distinction doesn't affect how the merge works or whether there is a conflict. What it affects, given that there is a conflict, is how it is displayed in the single-file markup.

matt
  • > In the two-part merge style, we just diff ours against theirs, so regions of identical content are not part of the conflict. Unfortunately, I think it is not that simple, as the base also plays a role in the "two-part" merge style. To see that, in the extreme case just use theirs (or yours) as base and then there are no conflicts at all. Thus even the "two-part" merge uses some kind of three way merge. – Henning Aug 12 '22 at 12:08
  • I didn't say a merge doesn't use the base. I know how a merge works. I'm talking about the _conflict_ and how it is _displayed._ The merge / diff3 comparison doesn't change anything about the conflict; there's a conflict either way. It just changes how the markup is printed in the conflicted document. – matt Aug 12 '22 at 12:09
  • I reworded the answer to speak a bit more clearly. I didn't realize that you thought merge / diff3 made a difference to the internal workings of the _merge_. It doesn't. These are not merge strategies; they are _display_ formats, i.e. techniques for showing the user the conflict in the artificially marked up single file. If you want to know what's really going _on_, i.e. what conditions the automerge choked on, you look, not at that marked up file, but at the :1:, :2:, and :3: versions of the files in the index. – matt Aug 12 '22 at 12:14
  • Thanks for your clarification! Does this mean that we can "easily" derive the conflictStyle:merge result from a merged file that was generated using diff3, by refining each conflict by diffing the yours section with the theirs section? – Henning Aug 12 '22 at 12:17
  • That, as I said in my very first comment (on your question), is what I do in fact in this particular situation. It's ironic but not problematic. The diff3 display style is just not as helpful to the poor old human eye in this particular situation, that's all. It's funny you should come along with this just now, as this just happened to me yesterday. – matt Aug 12 '22 at 12:19
  • Note that you can switch display styles in the middle of the horse (sorry, that metaphor broke down): `git checkout --conflict <diff3|merge> <filepath>`. I'll add that to my answer. There is a third style, `zdiff3`, and perhaps it does better here, but I have not tried it. – matt Aug 12 '22 at 12:25
  • Thanks, I think I now have a much better understanding of that setting! – Henning Aug 12 '22 at 12:33
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/247223/discussion-between-henning-and-matt). – Henning Aug 12 '22 at 12:34

This is mostly an addendum to matt's answer.

Remember that git merge works by comparing two versions of some file against a common starting "merge base" version of that same file. That is, we have three, not two, inputs, and in pseudo-code we do the following:

tip1=$(git rev-parse HEAD)
tip2=$(git rev-parse "$merge_argument")
base=$(git merge-base --all $tip1 $tip2)

# make sure there's only one "base", by whatever means;
# this code is omitted as it's complicated.

git show $base:$path > tmp.base
git show $tip1:$path > tmp.tip1
git show $tip2:$path > tmp.tip2

diff tmp.base tmp.tip1   # figure out what "we" changed in --ours
diff tmp.base tmp.tip2   # figure out what they changed in --theirs

# combine the changes (code not shown)
# apply the combined changes to tmp.base
# put tmp.base into place as the merge result, perhaps with conflicts

Whenever git merge reports a conflict, it is because the two diffs have produced diff hunks that overlap or abut and are not identical.
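The "overlapping or abutting, but not identical" rule can be illustrated with a small Python sketch. It follows the description in this answer rather than Git's actual code: `difflib.SequenceMatcher` stands in for Git's diff engine, and the helper names `changes` and `find_conflicts` are hypothetical. Each side's changes are expressed as half-open line ranges of the base, and two changes conflict when their ranges touch and they are not the same change.

```python
from difflib import SequenceMatcher


def changes(base, tip):
    """List (start, end, new_lines): base[start:end] was replaced by new_lines."""
    matcher = SequenceMatcher(a=base, b=tip, autojunk=False)
    return [
        (i1, i2, tip[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def find_conflicts(base, ours, theirs):
    """Return base ranges where the two sides made touching, different changes."""
    conflicts = []
    for o_start, o_end, o_text in changes(base, ours):
        for t_start, t_end, t_text in changes(base, theirs):
            # Ranges that overlap or merely share an endpoint count as touching.
            touching = o_start <= t_end and t_start <= o_end
            identical = (o_start, o_end, o_text) == (t_start, t_end, t_text)
            if touching and not identical:
                conflicts.append((min(o_start, t_start), max(o_end, t_end)))
    return conflicts


# The files from the question: both sides insert different text after line "1".
base = ["1", "2", "3"]
inserted = [
    "change1", "change2", "input1OnlyChange1",
    "change3", "change4", "change5", "change6",
    "input1OnlyChange2", "change7", "change8", "change9",
]
ours = ["1"] + inserted + ["2", "3"]
theirs = ["1"] + [s.replace("input1", "input2") for s in inserted] + ["2", "3"]

print(find_conflicts(base, ours, theirs))  # one conflict at the insertion point
print(find_conflicts(ours, ours, theirs))  # base equal to ours: no conflict
```

The second call mirrors the observation in the question: once the base matches one side, that side has no changes at all, so nothing can conflict.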

With the diff3 setting, Git:

  • places the tip1 contents at the top of the block after <<<<<<< HEAD
  • places the original-file (tmp.base) contents from the entire conflicted range in the middle, after ||||||| ID
  • adds =======
  • places the tip2 contents at the bottom of the block before >>>>>>> ID

(and all of this sits in the middle of the parts of the file that were resolved without conflict). The two IDs are hash IDs or strings (the method by which Git inserts a string in place of a hash ID is particularly disgustingly hacky, but that's an irrelevant implementation detail unless you want to run git-merge-recursive or git-merge-ort "raw" / "by hand").

With the merge conflict style, however, Git takes the conflicted part and does what it can to merge common parts. This results in moving the <<<<<<< HEAD line "down" and the >>>>>>> ID line "up". It can even split the conflict into multiple smaller conflicts, each of which gets its own separate marker. This makes the merge conflict "look smaller", which humans often find helpful (though I have personally found it confusing at times as well!).

As matt mentioned in a comment, there is a new option, zdiff3, that first appeared in Git 2.35 (see commit 4496526f80b3e4952036550b279eff8d1babd60a). The z here is for "zealous": this variant of diff3 attempts to do the same sort of merge-conflict-shrinkage that plain merge does, but without splitting the conflict. That is, it will only "move up" and "move down" across identical changes, shrinking the merge base section, but preserving it as a single chunk. The regular merge style could, as noted earlier, break it into multiple parts if that would shrink the conflict further; zdiff3 will not.

The commit message has a nice example in it, so click on over to the commit on GitHub to see how zdiff3 presents the given conflict.
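The "move up" and "move down" behaviour can be approximated with a short Python sketch. It is an assumption-laden illustration of the idea, not Git's code: the function name `zealous_trim` is hypothetical, and the sketch only looks at the two sides of the conflict and ignores the base section. Lines shared at the start and at the end of both sides are pulled out of the conflict, and everything in between stays as one chunk.

```python
def zealous_trim(ours, theirs):
    """Split one conflict into (prefix, ours_rest, theirs_rest, suffix).

    prefix and suffix are the lines both sides share at the start and at
    the end of the conflict; the rest stays inside a single conflict.
    (Hypothetical helper, not part of Git.)
    """
    limit = min(len(ours), len(theirs))
    p = 0
    while p < limit and ours[p] == theirs[p]:
        p += 1
    s = 0
    # Never let the suffix run into lines already claimed by the prefix.
    while s < limit - p and ours[-1 - s] == theirs[-1 - s]:
        s += 1
    return (
        ours[:p],
        ours[p:len(ours) - s],
        theirs[p:len(theirs) - s],
        ours[len(ours) - s:],
    )


# The two sides of the diff3-style conflict from the question.
ours = [
    "change1", "change2", "input1OnlyChange1",
    "change3", "change4", "change5", "change6",
    "input1OnlyChange2", "change7", "change8", "change9",
]
theirs = [line.replace("input1", "input2") for line in ours]

prefix, ours_rest, theirs_rest, suffix = zealous_trim(ours, theirs)
print(prefix)     # lines moved above the conflict
print(ours_rest)  # what remains on our side of the single conflict
print(suffix)     # lines moved below the conflict
```

For the question's example this moves `change1` and `change2` above the conflict and `change7` through `change9` below it. The shared `change3` through `change6` lines stay inside, because removing them would require splitting the conflict in two.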

torek
  • That commit comment includes the phrase "shows the base version too and the base version cannot be reasonably split" which is _exactly_ what my answer was trying to convey. Nice pointer. – matt Aug 13 '22 at 04:07
  • Here is an example where the merge conflict style is quite confusing (see conflict markers on the right side): https://user-images.githubusercontent.com/2931520/185163975-114d4091-30fc-4c08-8980-03cf95ac59f8.png Though the diff on the left side also could do better. – Henning Aug 17 '22 at 16:09