
For example, I have the following:

A--B (master)
 \
  C--D (feature)

If I run git rebase master feature, Git takes the diff between commits A and C and applies it on top of commit B, then takes the diff between commits C and D and applies it on top of the new commit C'. Git calls this replaying changes. Now, if I merge feature into master, what actually happens? My assumption, based on what I've read, is that Git finds the common ancestor, A in this case, and then:

  1. takes the diff between the common ancestor and the last commit on branch feature, here between A and D
  2. takes the diff between the common ancestor and the last commit on branch master, here between A and B
  3. applies the diffs from step 1 and step 2 to the common ancestor

Is this correct? I want to confirm this because the Git documentation also mentions replaying changes when describing the merge operation.

Max Koretskyi

2 Answers


The most common algorithm used by Git is the recursive three-way merge. The three-way part of the merge refers to the two branch heads (B and D) and their ancestor (A).

First, Git determines the common ancestor. In your example this is easy, but with a lot of merging between branches it can be non-trivial. If there are several candidate ancestors, Git merges those candidates with each other and uses the result as a virtual ancestor. If the candidates themselves have more than one common ancestor, it first merges those, and so on, until a single ancestor remains. This is the "recursive" part. Part of the reason to update feature branches with rebase is to keep the branch ancestry simple.
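As a rough illustration of the ancestor search, here is a minimal Python sketch over a toy commit graph. The `PARENTS` and `CRISS_CROSS` dictionaries and the helper names are made up for this example; Git's real implementation works on its object database and is far more efficient.

```python
# Toy commit graphs: each commit maps to the list of its parents.
# PARENTS is the history from the question; CRISS_CROSS is a hypothetical
# history where two branches have merged each other.
PARENTS = {"A": [], "B": ["A"], "C": ["A"], "D": ["C"]}
CRISS_CROSS = {"R": [], "P": ["R"], "Q": ["R"], "X": ["P", "Q"], "Y": ["Q", "P"]}


def ancestors(commit, parents):
    """Return the commit itself plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_bases(left, right, parents):
    """Return the common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(left, parents) & ancestors(right, parents)
    return {
        candidate
        for candidate in common
        if not any(
            candidate != other and candidate in ancestors(other, parents)
            for other in common
        )
    }


print(merge_bases("B", "D", PARENTS))              # {'A'}
print(sorted(merge_bases("X", "Y", CRISS_CROSS)))  # ['P', 'Q']
```

The second call is the situation where a recursive strategy has to merge the candidates (P and Q here) into a virtual ancestor before doing the real merge.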

The three-way merge looks for sections which are the same or different in the ancestor and two branches (A, B and D).

  • If all three agree, A is used (or B or D).
  • If only B and D agree, the two branches made the same change, so B (or D) is output.
  • If only the ancestor (A) and one branch agree, the other branch made a change.
    • If only B and A agree, D (feature) made a change, so D is output.
    • If only D and A agree, B (master) made a change, so B is output.
  • If all are different, there is a conflict.
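The rules above can be sketched as a small Python function that resolves a single section. The function name and the `(result, conflicted)` return shape are assumptions made for this example, not anything Git exposes; `base`, `ours` and `theirs` stand for the section as it appears in A, B and D.

```python
def merge_chunk(base, ours, theirs):
    """Resolve one section with the three-way rules; return (result, conflicted)."""
    if ours == theirs:
        # Covers "all three agree" and "both branches made the same change".
        return ours, False
    if ours == base:
        # Only the other branch changed this section, so take its version.
        return theirs, False
    if theirs == base:
        # Only our branch changed this section, so keep our version.
        return ours, False
    # Both branches changed the section in different ways.
    return None, True


print(merge_chunk("x = 1", "x = 1", "x = 1"))  # ('x = 1', False)
print(merge_chunk("x = 1", "x = 1", "x = 2"))  # ('x = 2', False)
print(merge_chunk("x = 1", "x = 3", "x = 1"))  # ('x = 3', False)
print(merge_chunk("x = 1", "x = 3", "x = 2"))  # (None, True)
```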

Git has the extra ability to recognize heuristically when a file has been renamed or copied. So if feature renamed foo to bar and made small changes, Git can often recognize that feature's foo is master's bar and merge correctly.
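To give a feel for the rename heuristic, here is a hedged Python sketch that pairs deleted paths with added paths when their contents are similar enough. The function names are hypothetical, and the score comes from Python's `difflib`, which is not the similarity measure Git itself uses; the 0.5 threshold only mirrors Git's documented default of 50% similarity for rename detection.

```python
from difflib import SequenceMatcher


def similarity(old_lines, new_lines):
    """Line-based similarity in [0, 1]; a stand-in for Git's own similarity index."""
    return SequenceMatcher(None, old_lines, new_lines).ratio()


def detect_renames(deleted, added, threshold=0.5):
    """Pair each deleted path with the most similar added path above the threshold."""
    renames = {}
    for old_path, old_lines in deleted.items():
        best_path, best_score = None, threshold
        for new_path, new_lines in added.items():
            score = similarity(old_lines, new_lines)
            if score >= best_score and new_path not in renames.values():
                best_path, best_score = new_path, score
        if best_path is not None:
            renames[old_path] = best_path
    return renames


# feature deleted "foo" and added "bar" with one line changed, plus an unrelated file.
deleted = {"foo": ["line one", "line two", "line three", "line four"]}
added = {
    "bar": ["line one", "line two", "line three", "line 4 changed"],
    "notes": ["x", "y"],
}
print(detect_renames(deleted, added))  # {'foo': 'bar'}
```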

diff works by solving the longest common subsequence problem. If you want a lot of detail, there is a formal study of how diff3 works.
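For readers who have not met the longest common subsequence problem, here is a textbook dynamic-programming sketch in Python. It is quadratic in time and memory and is meant only to show the idea; Git's own diff uses more efficient algorithms, and the function name `lcs` is an assumption for this example.

```python
def lcs(a, b):
    """Return one longest common subsequence of two sequences of lines."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the top-left corner to recover the subsequence.
    result, i, j = [], 0, 0
    while i < rows and j < cols:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


old = ["a", "b", "c", "d"]
new = ["a", "c", "d", "e"]
print(lcs(old, new))  # ['a', 'c', 'd']
```

Lines outside the common subsequence are what a diff reports: here "b" was deleted and "e" was added.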

But Git has multiple merge strategies and will pick what it thinks is the best one. If you think it guessed wrong, usually because there's a ton of conflicts, you can tell it which strategy to use with -s and pass options to that strategy with -X. You can read more about these strategies in the git-merge man page.

Here are some resources about what the strategies are and when to use them.

Schwern

That's correct: git gets a diff from the merge-base (A) to each tip and merges them.

It's worth noting that the branch you're on when you issue the git merge command is always the first parent of the resulting merge commit, and the branch tip you ask it to merge is the second. If the merge stops with a conflict, the --ours and --theirs flags to git checkout refer to the current commit and the to-be-merged-in commit, respectively.


During a git rebase operation, git moves to a "detached HEAD" on the target of the rebase and then (in essence for plain rebase, but literally for interactive rebase) cherry-picks each commit onto a new chain of commits that it builds as it goes. If a cherry-pick results in a merge conflict, git assigns --ours to the commit you're on (the detached HEAD) and --theirs to the commit you're cherry-picking, which is one from the branch you're rebasing; so in this case the sense of "ours" and "theirs" feels reversed.

Since rebase does as many cherry-picks as you have commits to rebase, you can get merge conflicts, resolve them, continue, and then get a different set of merge conflicts or, in some cases, the same merge conflicts again (in which case setting rerere.enabled may be a good idea).

Once a rebase finishes, git adjusts the branch reference to point to the tip of the newly built (and no longer "detached") HEAD.
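The rebase described above can be modelled in a few lines of Python. This is a toy sketch, not how git stores anything: commits are plain dictionaries mapping a path to its content, merging is done per whole file, and all names (`merge_file`, `cherry_pick`, the file names) are invented for the example. It treats each cherry-pick as a three-way merge whose base is the parent of the picked commit, with the detached HEAD as "ours" and the picked commit as "theirs".

```python
def merge_file(base, ours, theirs):
    """Three-way merge of one whole file; None means the file is absent."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs
    if theirs == base:
        return ours
    raise ValueError("conflict: ours is the detached HEAD, theirs is the picked commit")


def cherry_pick(head, commit, parent):
    """Replay the change parent -> commit on top of head, file by file."""
    result = {}
    for path in set(head) | set(commit) | set(parent):
        merged = merge_file(parent.get(path), head.get(path), commit.get(path))
        if merged is not None:
            result[path] = merged
    return result


# Snapshots for the history in the question (file names are hypothetical).
A = {"readme": "v1"}
B = {"readme": "v1", "master.txt": "from B"}
C = {"readme": "v1", "c.txt": "from C"}
D = {"readme": "v1", "c.txt": "from C", "d.txt": "from D"}

head = B  # detached HEAD starts at the rebase target (master)
for parent, commit in [(A, C), (C, D)]:
    head = cherry_pick(head, commit, parent)  # builds C', then D'

print(sorted(head))  # ['c.txt', 'd.txt', 'master.txt', 'readme']
```

After the loop, `head` plays the role of D', which is where the feature branch reference would be moved.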

torek