9

I am trying to understand what merge and rebase do, in terms of set operations in math.

In the following, "-" means diff (similar to set difference in math, except that "A-B" records both what is in A but not in B and, with a minus sign, what is in B but not in A), and "+" means patch (i.e. something like disjoint union in math; I haven't used patch before, so I am not sure).

From Version Control with Git, by Loeliger, 2nd ed.:

  1. The command git cherry-pick commit applies the changes introduced by the named commit on the current branch. It will introduce a new, distinct commit. Strictly speaking, using git cherry-pick doesn’t alter the existing history within a repository; instead, it adds to the history.

    Is it correct that F' = (F-B) + Z?

  2. The git revert commit command is substantially similar to the command git cherry-pick commit with one important difference: it applies the inverse of the given commit. Thus, this command is used to introduce a new commit that reverses the effects of a given commit.

    Is it correct that D' = G - D?

Patrick Lee
Tim
  • You have great questions. You should watch this series, it will teach you a lot: http://shop.oreilly.com/product/0636920024774.do – CodeWizard Jan 03 '16 at 00:47

3 Answers

9

cherry-pick

Is it correct that F' = (F-B) + Z?

No, that would also introduce the changes that were introduced in C, D and E.

git-cherry-pick works by isolating the unique changes in the commit to be cherry-picked (i.e., F-E in this example, ignoring additional ancestors including the merge base), and applying them to the target.

This is not done with patch application, but by using the three-way merge algorithm: the parent of the commit to be cherry-picked is used as the common ancestor, the commit to be cherry-picked is one side of the merge, and the target is the other side. The result contains both the changes introduced by the cherry-picked commit and the changes already in the target.

For example, if E is the parent of the commit to be cherry-picked, and its contents (acting as the common ancestor) are:

Line 1
Line 2
Line 3
Line 4
Line 5

And F, the commit to be cherry-picked, has these contents:

Line 1
Line 2
Line Three
Line 4
Line 5

And the target of the cherry-pick Z is:

LINE 1
Line 2
Line 3
Line 4
Line 5!

Then the result of a three-way merge is as follows (lines 1 and 5 come from Z, line 3 comes from F, and lines 2 and 4 are unchanged in all three):

LINE 1
Line 2
Line Three
Line 4
Line 5!
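To make the rule concrete, here is a minimal Python sketch of that merge. It is a simplification, not git's actual implementation: it assumes all three versions have the same number of lines and compares them position by position, whereas git works on diff hunks. The function name `three_way_merge` is just an illustrative name.

```python
def three_way_merge(base, ours, theirs):
    """Line-by-line three-way merge (assumes equal-length inputs)."""
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t:        # both sides agree, or neither side changed the line
            merged.append(o)
        elif o == b:      # only "theirs" changed the line: take their version
            merged.append(t)
        elif t == b:      # only "ours" changed the line: keep our version
            merged.append(o)
        else:             # both sides changed the line differently
            raise ValueError(f"conflict: {o!r} vs {t!r}")
    return merged


E = ["Line 1", "Line 2", "Line 3", "Line 4", "Line 5"]      # parent of F (common ancestor)
F = ["Line 1", "Line 2", "Line Three", "Line 4", "Line 5"]  # commit being cherry-picked
Z = ["LINE 1", "Line 2", "Line 3", "Line 4", "Line 5!"]     # target of the cherry-pick

result = three_way_merge(base=E, ours=Z, theirs=F)
print("\n".join(result))
```

Only the change F made relative to E (line 3) is carried over; everything Z changed on its own is kept.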

revert

Is it correct that D' = G - D?

Yes, roughly speaking. The changes that were unique to D have been removed from G. Like git-cherry-pick, git-revert is implemented using a three-way merge, though this time the commit to revert is treated as the common ancestor, one side is the current commit and the other side is the parent of the commit to revert.

This means that when a line is identical between the commit to revert and the current commit, the line from the reverted commit's parent will be chosen instead.

If D, the commit to revert, is acting as the common ancestor, and its contents are:

Line 1
Line 2
Line THREE
Line 4
Line FIVE

And the contents of C (D's parent) are:

Line 1
Line 2
Line 3
Line 4
Line 5

And G has been changed further, so that its contents are:

Line One
Line 2
Line THREE
Line 4
Line FIVE

Then the results of the three-way merge will be:

Line One
Line 2
Line 3
Line 4
Line 5

This is the result of taking the lines unique to the parent C and to the target G.
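The revert can be sketched the same way in Python, simply by swapping the roles in the merge. Again this is a simplified, position-by-position model that assumes equal-length files, not git's real code; `merge_lines`, `cherry_pick` and `revert` are hypothetical helper names.

```python
def merge_lines(base, ours, theirs):
    """Simplified three-way merge, one line position at a time."""
    out = []
    for b, o, t in zip(base, ours, theirs):
        if o == b:                  # we did not touch the line: take theirs
            out.append(t)
        elif t == b or o == t:      # they did not touch it, or both agree
            out.append(o)
        else:
            raise ValueError(f"conflict: {o!r} vs {t!r}")
    return out


def cherry_pick(current, commit, parent):
    # common ancestor = the picked commit's parent, other side = the commit
    return merge_lines(base=parent, ours=current, theirs=commit)


def revert(current, commit, parent):
    # common ancestor = the reverted commit, other side = its parent
    return merge_lines(base=commit, ours=current, theirs=parent)


D = ["Line 1", "Line 2", "Line THREE", "Line 4", "Line FIVE"]    # commit to revert
C = ["Line 1", "Line 2", "Line 3", "Line 4", "Line 5"]           # D's parent
G = ["Line One", "Line 2", "Line THREE", "Line 4", "Line FIVE"]  # current commit

reverted = revert(current=G, commit=D, parent=C)
print("\n".join(reverted))
```

Seen this way, a revert is a cherry-pick with the commit and its parent exchanged.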

Merge Commits

As torek notes (below), since both of these mechanisms rely on a parent commit, they break down when there is more than one parent commit (i.e., the commit in question is a merge and has multiple parents). In this case, you will need to tell git which parent to consider (using the -m flag).
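As a rough illustration of why the parent has to be named, here is a small Python sketch of the choice. It is a toy model, not git's code: `pick_base` is a hypothetical helper, commits are plain strings, and the only git behaviour assumed is that `-m` takes a parent number counted from 1.

```python
def pick_base(parents, mainline=None):
    """Return the parent to use as the common ancestor for cherry-pick/revert."""
    if len(parents) == 1:
        return parents[0]           # ordinary commit: the choice is unambiguous
    if mainline is None:
        raise ValueError("commit is a merge; choose a parent with -m")
    return parents[mainline - 1]    # parent numbers start at 1


print(pick_base(["C"]))                              # single parent, no flag needed
print(pick_base(["main-tip", "topic-tip"], mainline=1))  # merge commit, first parent chosen
```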

Conflicts

Of course, either of these mechanisms may cause conflicts. If the current commit has further changed a line that the cherry-pick or revert also needs to change, then you will have to resolve the conflict. For example, if in the revert example (above), a subsequent commit had also changed line 5, so G had actually been:

Line One
Line 2
Line THREE
Line 4
LINE FIVE!

Then there would be a conflict. The working directory (merged file) would be:

Line One
Line 2
Line 3
Line 4
<<<<<<<
LINE FIVE!
=======
Line 5
>>>>>>>

And you will need to decide whether you want the original change (Line 5) or the newest change (LINE FIVE!).
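The conflict case can be sketched in Python as well. This is a simplified, position-by-position model (equal-length files assumed) that writes bare conflict markers like the ones shown above; `merge_with_markers` is a hypothetical name and this is not how git itself is implemented.

```python
def merge_with_markers(base, ours, theirs):
    """Three-way merge that emits conflict markers instead of failing."""
    merged, conflicts = [], 0
    for b, o, t in zip(base, ours, theirs):
        if o == t or t == b:    # both agree, or only our side changed
            merged.append(o)
        elif o == b:            # only their side changed
            merged.append(t)
        else:                   # both sides changed the same line differently
            conflicts += 1
            merged += ["<<<<<<<", o, "=======", t, ">>>>>>>"]
    return merged, conflicts


D = ["Line 1", "Line 2", "Line THREE", "Line 4", "Line FIVE"]     # commit to revert (base)
C = ["Line 1", "Line 2", "Line 3", "Line 4", "Line 5"]            # D's parent (theirs)
G = ["Line One", "Line 2", "Line THREE", "Line 4", "LINE FIVE!"]  # current commit (ours)

merged, conflicts = merge_with_markers(base=D, ours=G, theirs=C)
print("\n".join(merged))
print("conflicts:", conflicts)
```

Line 3 still reverts cleanly because G never touched it; only line 5, which both G and the revert want to change, needs a manual decision.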

Community
Edward Thomson
  • 1
    It's also worth adding here that you can't cherry-pick or revert a merge commit *unless* you tell git which of the multiple parent commit(s) to use as the (pretended single) predecessor node. When picking `E` or reverting `D`, you don't need to do this as they already have only one predecessor node. – torek Jan 03 '16 at 05:38
  • @torek Very much so. I had hoped I could slide that distinction under the radar. :) I updated my answer to attempt to clarify this. – Edward Thomson Jan 03 '16 at 05:49
  • 1
    This is a real awesome answer! – Samuel Aug 14 '17 at 00:39
  • But what if Line 6 was added in Commit D. Would that get reverted since Commit G would also have that line inherited and Line 6 is not some identical line which will be found in Commit C? The same also goes while cherry-picking Commit F, if some Line 6 was added in Common Ancestor E. – Ruraloville Aug 20 '19 at 13:50
3

It's very simple to understand it like this:

cherry-pick

Choose a commit (from any branch, or even a loose commit) and place it on the current branch. In other words, take any commit from anywhere in the repository and bring it to my branch.


revert

Undo any commit. It "reverts" the changes made in that commit by undoing them. If you know what a patch is, you can see it as reversing the signs in the patch: - becomes + and vice versa. Your changes are "reverted", i.e. the changes are undone.

The git revert command undoes a committed snapshot.

But, instead of removing the commit from the project history,
it figures out how to undo the changes introduced by the commit and appends a new commit with the resulting content.

This prevents Git from losing history, which is important for the integrity of your revision history and for reliable collaboration.


Is it correct that F' = (F-B) + Z?

It simply means that the lower branch now also has the patch that was created in commit F: your lower branch contains its own changes plus the changes that were made in commit F (and only those, no commits other than F).


Is it correct that D' = G - D?

Not exactly. It means that you now have commit D and, a few commits later, the undo of that commit. The repository still has both commits, but the code is as if D had not changed it (change + undo in 2 separate commits).

CodeWizard
  • Thanks. I am still not clear what F' and D' consist of and how they are created from existing commits. (1) In the cherry-pick operation, Are common ancestors of F and Z, e.g. B, involved? (2) In revert operation, how is G involved? – Tim Jan 03 '16 at 02:48
  • Addressing your second question: D' is the "undo" of D, which means that any change that was made in D is now undone in D'. You have 2 commits: one is the original and the second is the undo of it. Your code is back to its state before commit D, but you have 2 commits. D = the changes, D' = undo of the changes. Does that make sense now? – CodeWizard Jan 03 '16 at 02:52
  • "In revert operation, how is G involved?" G is not involved; it's the last commit in figure 10-8, and D' is simply committed after it (figure 10-9). – CodeWizard Jan 03 '16 at 02:53
  • Thanks. But still no. – Tim Jan 03 '16 at 02:56
  • What else do you not understand? – CodeWizard Jan 03 '16 at 03:03
  • (revert) Look at it this way: you committed something to your repository and, X commits later, you "regret" it and wish to remove that commit. You simply use `git revert ` and git "undoes" all the changes made in that commit. Does that make more sense? – CodeWizard Jan 03 '16 at 03:07
  • The git revert command undoes a committed snapshot. But, instead of removing the commit from the project history, it figures out how to undo the changes introduced by the commit and appends a new commit with the resulting content. This prevents Git from losing history, which is important for the integrity of your revision history and for reliable collaboration – CodeWizard Jan 03 '16 at 03:10
2

Git 2.29 (Q4 2020) addresses a similar situation.

See commit 087c616, commit 409f066, commit 5065ce4 (20 Sep 2020) by brian m. carlson (bk2204).
(Merged by Junio C Hamano -- gitster -- in commit c5a8f1e, 29 Sep 2020)

docs: explain why reverts are not always applied on merge

Signed-off-by: brian m. carlson

A common scenario is for a user to apply a change to one branch and cherry-pick it into another, then later revert it in the first branch. This results in the change being present when the two branches are merged, which is confusing to many users.

We already have documentation for how this works in git merge(man), but it is clear from the frequency with which this is asked that it's hard to grasp.
We also don't explain to users that they are better off doing a rebase in this case, which will do what they intended.
Let's add an entry to the FAQ telling users what's happening and advising them to use rebase here.

gitfaq now includes in its man page:

If I make a change on two branches but revert it on one, why does the merge of those branches include the change?

By default, when Git does a merge, it uses a strategy called the recursive strategy, which does a fancy three-way merge.
In such a case, when Git performs the merge, it considers exactly three points: the two heads and a third point, called the merge base, which is usually the common ancestor of those commits.
Git does not consider the history or the individual commits that have happened on those branches at all.

As a result, if both sides have a change and one side has reverted that change, the result is to include the change.
This is because the code has changed on one side and there is no net change on the other, and in this scenario, Git adopts the change.

If this is a problem for you, you can do a rebase instead, rebasing the branch with the revert onto the other branch.
A rebase in this scenario will revert the change, because a rebase applies each individual commit, including the revert.
Note that rebases rewrite history, so you should avoid rebasing published branches unless you're sure you're comfortable with that.
See the NOTES section in git rebase for more details.
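The FAQ scenario can be played through with a small Python sketch. It is a toy model under stated assumptions: a one-line file, a position-by-position three-way merge, and rebase modelled as replaying each commit as a cherry-pick. The names (`three_way`, `branch_a_commits`, and so on) are illustrative, and real git may skip a replayed commit that is already applied, which gives the same end result here.

```python
def three_way(base, ours, theirs):
    """Simplified three-way merge over equal-length line lists."""
    out = []
    for b, o, t in zip(base, ours, theirs):
        if o == t or t == b:
            out.append(o)
        elif o == b:
            out.append(t)
        else:
            raise ValueError(f"conflict: {o!r} vs {t!r}")
    return out


merge_base = ["feature = off"]
changed = ["feature = on"]

# Branch A: applies the change, then reverts it (tip equals the merge base).
# Each entry is (parent contents, commit contents).
branch_a_commits = [(merge_base, changed), (changed, merge_base)]
branch_a_tip = merge_base
# Branch B: only has the cherry-picked change.
branch_b_tip = changed

# Merge looks only at the base and the two tips, so the change survives.
merged = three_way(base=merge_base, ours=branch_b_tip, theirs=branch_a_tip)

# Rebase A onto B replays each commit of A, including the revert.
rebased = branch_b_tip
for parent, commit in branch_a_commits:
    rebased = three_way(base=parent, ours=rebased, theirs=commit)

print("merge: ", merged)
print("rebase:", rebased)
```

The merge keeps `feature = on` because branch A shows no net change against the merge base, while the rebase ends with `feature = off` because the revert is replayed as its own step.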

VonC