I have failed to find any article/toturial which explain what exactly happands when one try to cherry-pick
multiple commits.
In this case, the cherry-pick code uses Git's sequencer, which is also used for git am
and git revert
(and in very recent versions of Git, some cases of git rebase
—git rebase
is, as you read, partly implemented using git cherry-pick
, although it's also partly implemented using git am
: which one you get depends on flags you supply to git rebase
). Note that internally, git revert
and git cherry-pick
are the same command (built from builtin/revert.c
).
The sequencer simply runs repeated "one-commit-at-a-time" Git sub-commands on a sequence of commits, with the option of skipping any individual commit if the single-shot command fails. The individual commit hash IDs are often—though not always—gathered by running git rev-list
. So the first part of your "2." and "3." questions can be found by running git rev-list
(though the result is not particularly useful to humans :-) since git rev-list
is meant to produce output useful to other Git commands instead).
So, let's take these in order:
What CHERRY_PICK_HEAD ref is?
When the sequencer is run on one commit for cherry-pick or revert, it notices, writes the commit ID to CHERRY_PICK_HEAD
or REVERT_HEAD
, and invokes the code to do a single pick/revert. (Follow the link to the actual Git source on GitHub for further details.) Otherwise, it does the rev-list walk to build the list of commits, writes them to the sequencer directory (or immediately fails and rejects your attempt if there's an ongoing sequenced operation), and then does one cherry-pick or revert at a time. This calls do_pick_commit(), which is a fairly complicated function, but you can see that at line 1118, it also will write the current commit's hash ID to CHERRY_PICK_HEAD
, if we are cherry-picking and we are going to stop for some reason.
Hence, whenever any individual cherry-pick fails and stops with an unmerged index, or stops after success due to the use of --no-commit
, CHERRY_PICK_HEAD
contains the hash ID of the commit that was being picked at the time the command stopped.
You can then resolve the problem and run git cherry-pick --continue
. This particular invocation checks for the existence of the sequencer directory; if it's there, it assumes you have fixed the problem and attempts to continue the existing, on-going cherry-pick sequence.
2--3--4 <-- dev
/
1
\
5--6--7 <-- master (HEAD)
By running git cherry-pick 2^..4
, what is the sequence of actions git does and exactly between which commits git use diff
?
If you run:
git rev-list 2^..4
(replacing 2
and 4
with actual hash IDs, or using the name dev
to identify commit 4) you'll see that this lists the hash IDs of 4, then 3, then 2 (in that order). When doing git cherry-pick
, though, Git specifically uses a reversed order on each ".."-style selection, so that the actual commit hashes are those for 2, then 3, then 4.
The sequencer therefore writes those three hash IDs into the sequencing area, and then runs do_pick_commit
on each one. Looking closely starting at lines 1043 and again at 1088, you can see that it's actually a little bit misleading to say that Git runs a diff between the parent and child of each commit. In fact, it runs a merge operation ("merge as a verb", as I like to put it), with the merge base being the parent of each commit and the to-be-merged commit as the --theirs
commit. (The --ours
commit is, as always, the current or HEAD
commit.)
However, the merge operation itself does, in effect, run git diff
between the merge base and each of the two branch tips. Since the merge base is the parent of the commit being cherry-picked, this diffs 2^
(or 1
) vs 2
as the input to the --theirs
side. It also diffs 2^
vs HEAD
as the input to the --ours
side, and then does a merge.
By default (without -n
/ --no-commit
), Git will commit the result of this merge, if it succeeds, as a single-parent, non-merge commit. So while this particular cherry-pick performs a merge, it makes an ordinary commit. The commit message for this new commit is a copy of the commit message from the original commit, i.e., a copy of the message from commit 2 (plus a line added holding the original commit hash if you ask for that using -x
).
If all has gone well, the sequencer moves on to commit 3. The parent of 3 is 2, so the sequencer invokes the merge machinery to merge commit 3 and the (newly created by previous step) HEAD using commit 2 as the merge base this time. This means that Git will diff 2 vs 3, and also 2 vs HEAD, combine the diffs, and if all goes well, make a new ordinary (non-merge) commit that becomes HEAD.
If that goes well, the sequencer moves on to commit 4, which behaves in the same way.
The final result is:
2--3--4 <-- dev
/
1
\
5--6--7--2'-3'-4' <-- master (HEAD)
where 2'
is a sort of copy of 2
, 3'
is a sort of copy of 3
, and 4'
is a sort of copy of 4
.
2---3---4
/ \
1--5--6--7--8 <-- dev
\
9 <-- master (HEAD)
By running git cherry-pick 1..8
, what git will do?
Here we are going to have multiple problems.
First, cherry-pick
invokes the sequencer code since you've specified a range of commits, 2^..8
. This particular sub-range gets reversed:
git rev-list --reverse 2^..8
This lists commits 2, 3, 4, 5, 6, 7, and 8 in some order, but what exactly is the order? We've asked for all commits reachable from commit 8 (including 8 itself), excluding all commits reachable from commit 2^
(i.e., commit 1). Certainly, without --reverse
, we'd see commit 8 first, which means that with --reverse
we will see commit 8 last. But 8 has two parents, namely 4 and 7. Git could pick either one here.
Without --topo-order
, Git chooses the one with the most recent time-stamp first. Suppose that the two time stamps make it pick 7 first. We'll get 8, then 7 (so that after reversing we'll have 7, then 8, at the end). Now there are again two commits to choose next: 6, and 4. Suppose that the two time stamps make Git pick 4 next. We'll now have two commits to choose from: 6, and 3. This process repeats until the two legs in the graph re-converge at commit 1 (which we won't pick anyway).
The --reverse
means that we get a linearized list ending with commit 8, but the order of 2, 3, 4, 5, 6, and 7 is determined by time stamps (specifically commit time stamps, not author time stamps). So it's not very easy, without looking at the commit time stamps or running git rev-list
, to know which order each individual commit will be cherry-picked.
In any case, the sequencer will still cherry-pick each commit one at a time, in whatever order they come out of git rev-list --reverse
. But eventually we'll cherry-pick all of 2/3/4/5/6/7, which are not merges, and then cherry-pick commit 8, which is a merge. Either way, we'll go through the code at lines 967–990. For the commits that are not merges, git cherry-pick
will demand that we have not supplied the -m
option. For the commit that is a merge—commit 8—git cherry-pick
will demand that we do supply the -m
option.
So this cherry-pick is guaranteed to fail. To make it work well, you must avoid cherry-picking the merge, and you should cherry-pick each individual range 2^..4
and 5^..7
(in either order).