For git cherry-pick resulting in a conflict, why does Git suggest more changes than just from the given commit?

Example:

-bash-4.2$ git init
Initialized empty Git repository in /home/pfusik/cp-so/.git/
-bash-4.2$ echo one >f
-bash-4.2$ git add f
-bash-4.2$ git commit -m "one"
[master (root-commit) d65bcac] one
 1 file changed, 1 insertion(+)
 create mode 100644 f
-bash-4.2$ git checkout -b foo
Switched to a new branch 'foo'
-bash-4.2$ echo two >>f
-bash-4.2$ git commit -a -m "two"
[foo 346ce5e] two
 1 file changed, 1 insertion(+)
-bash-4.2$ echo three >>f
-bash-4.2$ git commit -a -m "three"
[foo 4d4f9b0] three
 1 file changed, 1 insertion(+)
-bash-4.2$ echo four >>f
-bash-4.2$ git commit -a -m "four"
[foo ba0da6f] four
 1 file changed, 1 insertion(+)
-bash-4.2$ echo five >>f
-bash-4.2$ git commit -a -m "five"
[foo 0326e2e] five
 1 file changed, 1 insertion(+)
-bash-4.2$ git checkout master
Switched to branch 'master'
-bash-4.2$ git cherry-pick 0326e2e
error: could not apply 0326e2e... five
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
-bash-4.2$ cat f
one
<<<<<<< HEAD
=======
two
three
four
five
>>>>>>> 0326e2e... five

I was expecting just the "five" line between the conflict markers. Can I switch Git to my expected behavior?

0xF

2 Answers

Before we get any further, let's draw the commit graph:

A   <-- master (HEAD)
 \
  B--C--D--E   <-- foo

or, so that you can compare, here's the way Git draws it:

$ git log --all --decorate --oneline --graph
* 7c29363 (foo) five
* a35e919 four
* ee70402 three
* 9a179e6 two
* d443a2a (HEAD -> master) one

(Note that I turned your question into a command sequence, which I've appended; my commit hashes are of course different from yours.)

Cherry-pick is a peculiar form of merge

The reason you see the somewhat pathological behavior here is that git cherry-pick is actually performing a merge operation. The oddest part about this is the chosen merge base.

A normal merge

For a normal merge, you check out some commit (by checking out some branch which checks out the tip commit of that branch) and run git merge other. Git locates the commit specified by other, then uses the commit graph to locate the merge base, which is often pretty obvious from the graph. For instance when the graph looks like this:

          o--o--L   <-- ours (HEAD)
         /
...--o--B
         \
          o--o--R   <-- theirs

the merge base is simply commit B (for base).

To do the merge, Git then makes two git diffs: one from the merge base to our local commit L on the left, and one from the merge base to their commit R on the right (sometimes called the remote commit). That is:

git diff --find-renames B L   # find what we did on our branch
git diff --find-renames B R   # find what they did on theirs
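
As a rough sketch of those two diffs, the script below builds a throwaway repository with the same shape and asks Git for the merge base directly. The branch names ours and theirs, the file f, its contents, and the demo identity are all made up for illustration; git merge-base is the command that reports the commit Git would use as B.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work in any environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
printf '%s\n' a b c d e >f
git add f
git commit -q -m "base"            # commit B
git branch -M ours                 # name the current branch "ours"
git checkout -q -b theirs
printf '%s\n' a b c d E >f
git commit -q -a -m "theirs"       # commit R
git checkout -q ours
printf '%s\n' A b c d e >f
git commit -q -a -m "ours"         # commit L

# Ask Git which commit it considers the merge base of the two tips.
base=$(git merge-base ours theirs)
git diff --find-renames "$base" ours     # what we did on our branch
git diff --find-renames "$base" theirs   # what they did on theirs
```

Here the first diff should show only the change to line a and the second only the change to line e, which is why an ordinary merge of these two branches has nothing to argue about.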

Git can then combine these changes, applying the combined changes to B, to make a new merge commit whose first parent is L and second parent is R. That final commit is a merge commit, which uses the word "merge" as an adjective. We often just call it a merge, which uses the word "merge" as a noun.

To get this merge-as-a-noun, though, Git had to run the merge machinery, to combine two sets of diffs. This is the process of merging, using the word "merge" as a verb.

A cherry-pick merge

To do a cherry-pick, Git runs the merge machinery—the merge as a verb, as I like to put it—but picks out a peculiar merge base. The merge base of the cherry-pick is simply the parent of the commit being cherry-picked.

In your case, you're cherry-picking commit E. So Git is merging (verb) with commit D as the merge base, commit A as the left/local L commit, and commit E as the right-side R commit. Git generates the internal equivalent of two diff listings:

git diff --find-renames D A   # what we did
git diff --find-renames D E   # what they did

What we did was to delete three lines: the ones reading two, three, and four. What they did was to add one line: the one reading five.
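
The two diffs can be reproduced on the history from the question. This is only a sketch: it rebuilds that history in a temporary directory with a demo identity (both assumptions, so it runs anywhere) and uses foo~1 to name commit D, the parent of the cherry-picked commit E.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work in any environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
echo one >f
git add f
git commit -q -m "one"             # commit A
git branch -M master               # make sure the branch is called master
git checkout -q -b foo
for n in two three four five; do   # commits B, C, D, E
    echo "$n" >>f
    git commit -q -a -m "$n"
done

# foo~1 is D, the parent of E, i.e. the merge base cherry-pick uses.
git diff --find-renames foo~1 master   # "what we did": D -> A
git diff --find-renames foo~1 foo      # "what they did": D -> E
```

The first diff should report two, three, and four as deleted, even though master never contained them; the second should report only five as added.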

Using merge.conflictStyle

This all becomes somewhat clearer (well, maybe somewhat clearer) if we set merge.conflictStyle to diff3. Now, instead of showing just the ours and theirs sections surrounded by <<<<<<< etc., Git adds the merge base version as well, marked with |||||||:

one
<<<<<<< HEAD
||||||| parent of 7c29363... five
two
three
four
=======
two
three
four
five
>>>>>>> 7c29363... five

We now see that Git claims that we deleted three lines from the base, while they kept those three lines and added a fourth.

Of course, we need to understand that the merge base here was the parent of commit E, which is, if anything, "ahead of" our current commit A. It's not really true that we deleted three lines. In fact, we never had the three lines in the first place. We just have to deal with Git showing things as if we had deleted some lines.
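
One way to try the diff3 style without touching any configuration file is to pass the setting for a single command with git -c. The sketch below rebuilds the question's history in a temporary directory (an assumption, along with the demo identity) and repeats the cherry-pick with that setting. The exact wording of the label after ||||||| may differ between Git versions.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work in any environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
echo one >f
git add f
git commit -q -m "one"
git branch -M master
git checkout -q -b foo
for n in two three four five; do
    echo "$n" >>f
    git commit -q -a -m "$n"
done
git checkout -q master

# The cherry-pick stops with a conflict and exits non-zero,
# so keep set -e from aborting the script at this point.
git -c merge.conflictStyle=diff3 cherry-pick foo >/dev/null 2>&1 || true

cat f   # conflict now includes the merge-base section after |||||||
```

To make the style permanent, git config --global merge.conflictStyle diff3 sets it for every repository. For a conflict that already exists, git checkout --conflict=diff3 f should rewrite the markers in f in the same style.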

Appendix: script to generate the clash

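#! /bin/sh
# Recreate the repository from the question, ending in the conflicting cherry-pick.
# Assumes the initial branch is named master, as in the question.
set -e
mkdir t
cd t
git init
echo one >f
git add f
git commit -m "one"
# Branch foo appends one line per commit: two, three, four, five.
git checkout -b foo
echo two >>f
git commit -a -m "two"
echo three >>f
git commit -a -m "three"
echo four >>f
git commit -a -m "four"
echo five >>f
git commit -a -m "five"
# Back on master, pick only the tip commit of foo.
# This stops with a conflict, so the script exits non-zero here.
git checkout master
git cherry-pick foo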
torek
  • Very clear. Any suggestion about how to deal with this situation in Meld (which I have configured with `meld "$LOCAL" "$MERGED" "$REMOTE" --output "$MERGED"` after [this answer](https://stackoverflow.com/a/34119867/3127111))? It is showing extra changes already in the middle pane (which is `$MERGED`). – watery Apr 05 '23 at 14:04

Well... it seems to me this is an edge case, living at the intersection of "cherry-picking an insertion that was made in the source branch at a line that doesn't exist in the target branch", and "progression of changes that all are deemed one 'hunk' because each is adjacent to the one before".

These issues make git unsure what you intend, so it asks "what should I do" in the broadest way, giving you all the code you might need to make a correct decision.

If you init the file as

a
b
c
d
e
f
g
h
i

and then create changes like

x -- A <--(master)
 \
  C -- E -- G <--(branch)

(where commit labeled A uppercases the corresponding letter in the file, etc.), this is a case that will behave much more like you expect - because everything looks very clear to git and it just does the obvious thing when you say to cherry-pick E to master. (Not only does it not stick unrelated changes into conflict markers, but there are no conflict markers; it just works.)
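
A sketch of that setup, for anyone who wants to try it: the helper upper, the temporary directory, and the demo identity are assumptions made for illustration, while the file contents and the commit names follow the graph above.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work in any environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# upper OLD NEW: replace the line OLD with NEW in f, then commit (hypothetical helper).
upper() {
    sed "s/^$1\$/$2/" f >f.tmp
    mv f.tmp f
    git commit -q -a -m "$2"
}

cd "$(mktemp -d)"
git init -q
printf '%s\n' a b c d e f g h i >f
git add f
git commit -q -m "x"               # commit x
git branch -M master
git checkout -q -b branch
upper c C                          # commit C
upper e E                          # commit E
upper g G                          # commit G
git checkout -q master
upper a A                          # commit A

# branch~1 is commit E; its parent C becomes the merge base.
git cherry-pick branch~1 >/dev/null
cat f
```

After the cherry-pick, f on master should have only the a and e lines uppercased. The c line stays lowercase because, relative to the merge base C, that is "our" change, and it sits far enough from the e line not to collide with it.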

Now to be clear - I'm not saying cherry-pick will only do the right thing in such clear-cut cases as mine. But just as your test case is kind of a "worst case" scenario for a cherry-pick, mine is an ideal case. There are many cases in between, and in practice people usually seem to get the results they want out of cherry-pick. (In case that sounds like an endorsement of the cherry-pick command, let me clarify: I think it's the most over-recommended command in the suite, and I never really use it. But it does work, within its defined boundaries.)

Basically remember two things:

1) If you insert a new line on the source branch, and a corresponding insertion point can't be unambiguously determined in the target branch, you're going to get a conflict, and it will probably include "context" from the source branch.

2) If two changes are adjacent - even though they don't overlap - they can still be considered in conflict.
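
The second point can be checked with a small experiment. In this sketch (file contents, branch names, and the demo identity are assumptions), master changes line b and the other branch changes the neighbouring line c, so the two edits touch but do not overlap.

```shell
#!/bin/sh
set -e
# Throwaway identity (an assumption) so commits work in any environment.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q
printf '%s\n' a b c d >f
git add f
git commit -q -m "x"
git branch -M master
git checkout -q -b branch
printf '%s\n' a b C d >f           # source branch edits line c
git commit -q -a -m "C"
git checkout -q master
printf '%s\n' a B c d >f           # target branch edits the adjacent line b
git commit -q -a -m "B"

# Record whether the cherry-pick applied cleanly or stopped on a conflict.
if git cherry-pick branch >/dev/null 2>&1; then
    status=clean
else
    status=conflict
fi
echo "cherry-pick result: $status"
```

With the edits on adjacent lines the cherry-pick should stop with conflict markers in f. Separating the two edited lines by at least one unchanged line is what lets the earlier a-to-i example go through cleanly.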

Mark Adelsberger