Confusions about the merge/rebase step of git pull

Question

From Version Control with Git by Loeliger 2ed, about the merge or rebase step in git pull:

In the second step of the pull operation, Git performs a merge (the default), or a rebase operation.

About the merge step in git pull:

In this example, Git merges the contents of the remote-tracking branch, origin/master, into your local-tracking branch, master, using a special type of merge called a fast-forward.

But how did Git know to merge those particular branches? The answer comes from the configuration file:
[branch "master"]
        remote = origin
        merge = refs/heads/master
Paraphrased, this gives Git two key pieces of information: When master is the current, checked out branch, use origin as the default remote from which to fetch updates during a fetch (or pull). Further, during the merge step of git pull, use refs/heads/master from the remote as the default branch to merge into this, the master branch.

Generally speaking, does the merge step merge the remote tracking branch into master or the current branch? My guess is the current branch, i.e. the branch pointed by HEAD, and it is not necessarily master.

Note my guess is from https://git-scm.com/docs/git-pull

Incorporates changes from a remote repository into the current branch. In its default mode, git pull is shorthand for git fetch followed by git merge FETCH_HEAD.

More precisely, git pull runs git fetch with the given parameters and calls git merge to merge the retrieved branch heads into the current branch.

Does git pull merge the remote tracking branch always into the current branch? (My guess is yes). If not, is there an argument to git pull that specifies the target branch in the merging step? If I am correct, the refspec argument to git pull doesn't specify the target branch for merging step.

Why is the merge a "fast-forward merge"?

About the rebase step in git pull:

The command git pull --rebase will cause Git to rebase (rather than merge) your local-tracking branch onto the remote-tracking branch during only this pull.

Is it correct that the rebase step of git pull rebases the current branch (i.e. the branch pointed to by HEAD) onto the remote tracking branch? If yes, why does the quote say "your local-tracking branch" instead of the current branch?

Kaz · Answer 1 · 2015-12-29T03:25:05.343

Generally speaking, does the merge step merge the remote tracking branch into master or the current branch?

Those two are the same: master is your current branch. The remote tracking branch is origin/master.

(Where origin is just the default name used for a remote, not any special Git "keyword", and master is the default branch. Actual names may differ in actual scenarios.)

The target of a rebase or merge operation (where the resulting changes will go) is the current branch.

Git will fight you if you try to do something silly like checkout origin/master to try to make a tracking branch current.

So, the target of the merge is master. The question is, what gets merged? If you pull some new upstream material, you may end up with (for example) this situation:

     YOURS (master)             
             *
              \      UPSTREAM (origin/master)    
                *     *
                 \   /
                   *      <--- "git merge-base master origin/master"
                   |
                   *

master and origin/master have diverged, and have 2 and 1 new commits, respectively.

With git rebase, the two local commits under YOURS get rewritten over top of UPSTREAM:

      YOURS (master)             
              *       <-- rewritten: SHA changes
               \          
ghost of YOURS  *       <-- rewritten: SHA changes
      * ---- *    \
              \     * UPSTREAM (origin/master)
               \  /
                 *
                |

Now the situation is that "local branch master is ahead of origin/master by 2 commits". You are ready to (try to) git push.

Your original unrewritten commits still exist (denoted by "ghost of YOURS" in the ASCII diagram). They are referenced in the git reflog. When they expire from there or are manually cleared, they become garbage, which is eventually garbage-collected.
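To make the rebase picture concrete, here is a minimal sketch that builds the diverged situation in a temporary directory and then runs git pull --rebase. The repository names upstream and mine, the commit helper, and the file names are all made up for illustration; it also forces the branch name master to match this thread.

```shell
#!/bin/sh
set -e
# Throwaway identity so commits work without any global configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# commit <repo> <name>: hypothetical helper, adds <name>.txt and commits it.
commit() {
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master   # use the branch name from this thread
commit upstream base
git clone -q upstream mine

# Diverge: two local commits (YOURS) and one new upstream commit (UPSTREAM).
commit mine local1
commit mine local2
commit upstream theirs

old=$(git -C mine rev-parse master)
git -C mine pull -q --rebase
new=$(git -C mine rev-parse master)
ahead=$(git -C mine rev-list --count origin/master..master)

echo "tip before rebase: $old"
echo "tip after rebase:  $new"
echo "ahead of origin/master by: $ahead"

# The pre-rebase tip is no longer on master, but the object still exists
# (this is the "ghost of YOURS").
git -C mine cat-file -e "$old" && echo "old tip is still in the object database"
```

Running git -C mine reflog show master afterwards should list the old tip as an earlier position of master.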

With git merge you do something else: Your changes are merged against origin/master and a new commit is created which has two parents:

                  * YOURS-merged (master)
                /  \
     YOURS    /      \          
             *        |
              \       |  UPSTREAM (origin/master)    
                *     *
                 \   /
                   *      <--- "git merge-base master origin/master"
                   |
                   *

Now your master is ahead of origin/master by 3 commits: your two original commits plus the new merge commit. If you successfully push now, the remote repository's master will take on the above shape also, with the two-parent commit with your changes along one path, and the previous upstream commit along the other path. There is no "ghost of YOURS": your changes were never rewritten.
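The merge case can be checked the same way. This sketch assumes the same invented setup (repositories upstream and mine, a hypothetical commit helper) and uses git pull --no-rebase so that the merge behaviour is explicit regardless of local configuration.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# commit <repo> <name>: hypothetical helper, adds <name>.txt and commits it.
commit() {
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream base
git clone -q upstream mine

# Diverge: two local commits and one upstream commit.
commit mine local1
commit mine local2
commit upstream theirs

yours=$(git -C mine rev-parse master)
git -C mine pull -q --no-rebase --no-edit

# rev-list --parents prints the commit followed by its parents.
set -- $(git -C mine rev-list --parents -n 1 master)
parents=$(($# - 1))
first_parent=$(git -C mine rev-parse 'master^1')
ahead=$(git -C mine rev-list --count origin/master..master)

echo "parents of the new tip: $parents"
echo "ahead of origin/master by: $ahead"
```

The first parent of the merge commit is the old local tip, so the original commits keep their IDs; nothing is rewritten.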

Why is the merge a "fast-forward merge"?

Two scenarios are worth distinguishing in which master and origin/master have not diverged. One is that origin/master has no commits that master lacks:

     YOURS (master)             
             *
              \          
                *
                 \  
                   *    UPSTREAM (origin/master)
                   |
                   *

In the above case there is nothing to do: everything is up-to-date and you are ahead of origin/master by two commits. Thus, this is not a "fast forward anything"; a git merge or git rebase will simply do nothing at all. You can try to git push if you are happy with those commits.

The true fast-forward scenario occurs when you don't have any local changes:

                     * UPSTREAM (origin/master) 
                    /
    YOURS (master) *
                   |
                   *

In this case if you do a git rebase or git merge, then by default, your master HEAD pointer just "slides ahead" to point to the same commit as origin/master, and this slide is the "fast forward":

        YOURS (master) *  UPSTREAM (origin/master)  
                     /
                    *  <- master slid forward from here
                    |
                    *

The "fast forward" terminology probably refers to the idea that only the HEAD pointer moves (along with a rewrite of your local filesystem tree to match). No new commits have to be written or rewritten.
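A small sketch of the true fast-forward scenario, using made-up repository names (upstream, mine) and a hypothetical commit helper. It uses --ff-only so that the pull can only succeed by sliding master forward.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# commit <repo> <name>: hypothetical helper, adds <name>.txt and commits it.
commit() {
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream base
git clone -q upstream mine

# Only upstream moves; there is no local work.
commit upstream new1

before=$(git -C mine rev-parse master)
git -C mine pull -q --ff-only
after=$(git -C mine rev-parse master)
upstream_tip=$(git -C upstream rev-parse master)

echo "master before: $before"
echo "master after:  $after"
echo "upstream tip:  $upstream_tip"
```

After the pull, master names exactly the commit that upstream already had, which is the point: no commit was created or rewritten locally.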

A fast-forward is not possible in the diverging scenarios (unless you throw away your local changes). A new merge commit has to be produced, or a rebase has to rewrite some changes. Before these operations, there doesn't exist any commit to which master can just slide forward.

Some people use Git in a certain way where they want all their commits of local work to be merges. When there is no new work upstream to merge against, you can force Git to make a merge commit anyway with git merge --no-ff (no fast-forward). Starting with this:

     YOURS (master)             
             *
              \          
                *
                 \  
                   *    UPSTREAM (origin/master)
                   |
                   *

You get something like this:

               ----* YOURS-merge   (master)
     YOURS    /    |  
             *     |
              \    |     
                *  |
                 \ | 
                   *    UPSTREAM (origin/master)
                   |
                   *

The idea is that in the mainline path of the history, the two changes appear condensed into one step. The other parent can be followed to see the detailed lineage with the multiple commits.
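One way to reproduce a picture like this is sketched below. It assumes the two local commits live on a separate branch (hypothetically named topic) while master still sits at origin/master; if master already contains the commits, git merge --no-ff origin/master just reports that everything is up to date, because there is nothing to merge. The repository names and the commit helper are invented.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# commit <repo> <name>: hypothetical helper, adds <name>.txt and commits it.
commit() {
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream base
git clone -q upstream mine

# Local work goes on a topic branch; master stays at origin/master.
git -C mine checkout -q -b topic
commit mine local1
commit mine local2
git -C mine checkout -q master

# A plain merge would fast-forward here; --no-ff forces a merge commit.
git -C mine merge -q --no-ff --no-edit topic

set -- $(git -C mine rev-list --parents -n 1 master)
parents=$(($# - 1))
first_parent=$(git -C mine rev-parse 'master^1')
second_parent=$(git -C mine rev-parse 'master^2')

echo "parents of master: $parents"
```

Following the first parent goes straight back to the upstream commit; following the second parent walks through the individual topic commits.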

Why does the quote say "your local-tracking branch" instead of the current branch?

This is a mistake. All branches involved are local (in your repository). There is the working branch (like master), and its remote-tracking branch (like origin/master). Both are local.

The branch where you do work isn't called a tracking branch; it doesn't track anything.

Some local branches are not paired with remote-tracking branches. All branches in a repo which has no remote (for instance something newly created by git init) are purely local. Any locally created branch is purely local unless pushed to a remote repo.

The remote-tracking branch is a branch-like object which is paired with a local branch. It keeps track of what the upstream is doing. Every time you do a git fetch, it is potentially rewritten to point to some different commit. The local branch is unaffected. The two may diverge, and that is resolved by rebase or merge. When you do a git push, the local branch updates the actual remote one in the upstream repo. When this operation is successful (not rejected by the remote end), then the remote-tracking branch is also updated to point to that same commit, so they stay in sync.

Under normal circumstances, no purely local operation will change origin/master: it tracks the state of the corresponding branch in the remote repo.

score 1 · Answer 2 · edited May 23 '17 at 12:24

As in my other comment recently, I'd like to avoid the term "local-tracking branch" entirely and concern myself here only with whether a local branch has an upstream set, and if so, what that upstream is set to (which as the book you're quoting notes, is split into two parts, the "remote" and "merge" strings).
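To see the two parts of an upstream setting for yourself, something like the following sketch should work. It creates a throwaway repository pair (the names upstream and mine are made up) and reads back the "remote" and "merge" strings that git clone records for master.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master   # use the branch name from this thread
git -C upstream commit -q --allow-empty -m base
git clone -q upstream mine

# The two halves of the upstream setting, as stored in .git/config.
remote=$(git -C mine config branch.master.remote)
merge=$(git -C mine config branch.master.merge)

# The same information resolved to a remote-tracking branch name.
upstream=$(git -C mine rev-parse --abbrev-ref 'master@{upstream}')

echo "remote=$remote merge=$merge upstream=$upstream"
```

A branch created locally without a start point from a remote has neither setting until you add one, for example with git branch --set-upstream-to.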

For your first question:

Generally speaking, does the merge step merge the remote tracking branch into master or the current branch?

It's always the current branch. The git pull command (which used to be a script but was recently rewritten in C) actually runs git merge, and git merge always starts with the current branch (and current work-tree).

What gets merged-in is a bit trickier. Again, this is a bit you quoted; I will just move the emphasis:

... git pull ... calls git merge to merge the retrieved branch heads into the current branch.

Once you're comfortable with "always merges into the current branch", the remaining key question here is: which branch heads were retrieved?

The answer is a little bit complicated, as it depends on the arguments you pass to git pull. The general form is git pull [options] remote refspec [refspec ...], i.e., the second word is the literal string pull, the third word (after parsing any options) is the name of the remote, and the fourth-and-further words are "refspecs".

You can specify a refspec, or multiple refspecs, in which case git fetch uses those to bring over some specific branch head or heads, and those are what get merged. Or, you can leave them out, in which case git pull directs git fetch to bring over the current branch's upstream.¹

This leads to a common mistake. If you have local branches A and B that are set to track origin/A and origin/B respectively, people like to enter the command git pull origin A B, thinking this will update origin/A and origin/B (and it will!²) but then also thinking that git would then merge origin/A into A, and origin/B into B, and it won't. Instead, it brings over the two branch heads, then runs an "octopus merge" into the current branch (whatever the current branch is).
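The pitfall can be reproduced in a scratch directory. The sketch below uses invented repository names (upstream, mine) and a hypothetical commit helper, and it assumes a git new enough (1.8.4 or later) to update origin/A and origin/B during the fetch. The current branch is master, with one local commit so that no shortcut is possible.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# commit <repo> <name>: hypothetical helper, adds <name>.txt and commits it.
commit() {
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream base
git -C upstream branch A
git -C upstream branch B
git clone -q upstream mine
git -C mine branch -q A origin/A     # local A, upstream origin/A
git -C mine branch -q B origin/B     # local B, upstream origin/B

# New upstream work on A and B, plus one local commit on master.
git -C upstream checkout -q A && commit upstream a1
git -C upstream checkout -q B && commit upstream b1
git -C upstream checkout -q master
commit mine local1

old_A=$(git -C mine rev-parse A)
git -C mine pull -q --no-rebase --no-edit origin A B   # run while on master

set -- $(git -C mine rev-list --parents -n 1 master)
parents=$(($# - 1))
echo "parents of master's new tip: $parents"
echo "local A moved? $( [ "$(git -C mine rev-parse A)" = "$old_A" ] && echo no || echo yes )"
```

Local A and B stay where they were; only origin/A, origin/B, and the current branch change.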

The answer to this multi-part question, then, is:

Does git pull merge the remote tracking branch always into the current branch? (My guess is yes). If not, is there an argument to git pull that specifies the target branch in the merging step? If I am correct, the refspec argument to git pull doesn't specify the target branch for merging step.

(A) No, or at least, it's not guaranteed: it merges the retrieved branch heads (which git fetch recorded in FETCH_HEAD; and if git fetch retrieved extra heads, which it does sometimes, it marks those not-for-merge to "hide" them from the merge step), which may or may not be the same as the remote-tracking branch set as the current branch's upstream. In my experience, most users are surprised when it's not the current branch's upstream; that's usually not what they wanted.

(B) Yes: the "refspecs" you specify (if any) determine which branch heads are retrieved. Fetch refspecs are subtly different from push refspecs and I won't go into a lot of detail here (see this answer for details), but the short version is that if you leave out the colon :, these are the heads deposited into FETCH_HEAD for merging. The pull code supplies a default refspec if needed, so that there's something in FETCH_HEAD.
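To look at FETCH_HEAD directly, you can split the pull into its two steps by hand. This is only a sketch with made-up repository names (upstream, mine) and a hypothetical commit helper; it passes a colon-less refspec (master) to git fetch and then merges whatever landed in FETCH_HEAD.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# commit <repo> <name>: hypothetical helper, adds <name>.txt and commits it.
commit() {
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream base
git clone -q upstream mine
commit upstream new1

# Step 1 of a pull: fetch one head, named by a refspec without a colon.
git -C mine fetch -q origin master
echo "--- contents of FETCH_HEAD ---"
cat mine/.git/FETCH_HEAD
fetched=$(git -C mine rev-parse FETCH_HEAD)

# Step 2 of a pull: merge what was fetched into the current branch.
git -C mine merge -q --ff-only FETCH_HEAD
merged=$(git -C mine rev-parse master)
```

Heads that git fetch brings over but that are not meant for the merge step show up in the same file with a not-for-merge marker.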

The answer to this question is also a bit complicated, because the question has two assumptions built in that may be wrong:

Why is the merge a "fast-forward merge"?

First, it isn't necessarily a fast-forward merge.

Second, "fast-forward merge" is something of a misnomer, or at least, a short-cut name that obscures a deeper truth. "Fast-forwarding" is really a property of a label move rather than of a merge. Git calls a merge a "fast-forward" when no actual merge is required and the simple label move can be done instead.

Remember that in git, the word "branch" has at least two meanings: there's the branch name, like master or feature or bug2.71828 or whatever, and then there's the underlying commit-graph data structure. The data structures are permanent but the names are ephemeral and can be changed.³ When you ask git to merge the branch named X into your current branch, it first resolves the name X to a commit ID. It then checks whether the current HEAD commit (the tip of your current branch) is an ancestor of the target commit ID. If so, no actual merge is required.⁴

In the "no actual merge required" case, git will normally replace the merge operation with a simple label move: if HEAD is a symbolic reference to a branch name (i.e., if you're "on a branch" in git status terms), the branch name (the label) is simply "moved forward" to point to the target commit. (If you're in "detached HEAD" mode, git does the same thing but writes the new ID into HEAD itself, rather than into the current branch's file.)
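The ancestor test can be run by hand with git merge-base --is-ancestor. The sketch below (repository names and the commit helper are invented) checks it once when only upstream has moved, and again after a local commit has made the histories diverge.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# commit <repo> <name>: hypothetical helper, adds <name>.txt and commits it.
commit() {
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

# can_ff: prints yes if master is an ancestor of origin/master, else no.
can_ff() {
  if git -C mine merge-base --is-ancestor master origin/master; then
    echo yes
  else
    echo no
  fi
}

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream base
git clone -q upstream mine
commit upstream new1
git -C mine fetch -q

# Only upstream moved: the current tip is an ancestor of the target,
# and the merge base is the current tip itself.
ff_before=$(can_ff)
base_before=$(git -C mine merge-base master origin/master)
tip_before=$(git -C mine rev-parse master)

# A local commit makes the two histories diverge.
commit mine local1
ff_after=$(can_ff)

echo "fast-forward possible before local commit: $ff_before"
echo "fast-forward possible after local commit:  $ff_after"
```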

The --no-ff flag, if you give it, forces git merge to make an actual merge commit even if fast-forwarding is possible. (In the same-but-opposite fashion, --ff-only forces git not to make a merge, but if a true merge would be required, the only way to not make one is to fail, so --ff-only will fail in this case, while --no-ff succeeds in both "merge required" and "no merge required" cases.)
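Here is a quick sketch of the two flags on a diverged history, again with invented repository names and a hypothetical commit helper. It expects --ff-only to refuse and leave master untouched, and --no-ff to go ahead with a real merge.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# commit <repo> <name>: hypothetical helper, adds <name>.txt and commits it.
commit() {
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream base
git clone -q upstream mine

# Diverge, then bring origin/master up to date without merging.
commit mine local1
commit upstream theirs
git -C mine fetch -q

tip=$(git -C mine rev-parse master)

# --ff-only: a true merge would be needed, so the command fails.
if git -C mine merge -q --ff-only origin/master 2>/dev/null; then
  ff_only=succeeded
else
  ff_only=failed
fi
tip_after_ff_only=$(git -C mine rev-parse master)

# --no-ff: always makes a merge commit.
git -C mine merge -q --no-ff --no-edit origin/master
set -- $(git -C mine rev-list --parents -n 1 master)
parents=$(($# - 1))

echo "--ff-only $ff_only; --no-ff produced a commit with $parents parents"
```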

This answer is already long enough (or too long :-) ) so all I will say about adding --rebase is that the fetch step works the same, but the pull code then uses git rebase (with --onto, and with recent versions of git, some extra complications using "fork points") to rebase onto the fetched head (and must error out if you fetch multiple heads, since you can't rebase onto more than one head).

¹There's a special exception for the case where the current branch's upstream has a literal dot . as its "remote" name. In this case, the "merge" setting refers to another local branch, rather than a remote-tracking branch, and git pull skips the git fetch step since there's no need to fetch from yourself.

²This assumes you have git version 1.8.4 or newer. In older versions of git, the git fetch step brings over the branch heads, but fails to update origin/A and origin/B, for reasons that were intentional but turned out bad (hence the change in 1.8.4).

³As far as I know, git is unique in this respect. Other VCSes, including Mercurial, don't implement branches this way. Branches can still be renamed, but the names are much more deeply embedded and cannot be removed entirely, the way they can be in git.

⁴Alternatively, we can say that git first computes the merge-base, and if the merge-base is the current commit, no actual merging is required.

score 1 · Answer 3 · answered Dec 29 '15 at 02:29

Generally speaking, does the merge step merge the remote tracking branch into master or the current branch? My guess is the current branch, i.e. the branch pointed by HEAD, and it is not necessarily master.

Yes, the pull operation will not change branches, you'll end up (possibly pointing to a different commit) on the same branch.

Does git pull merge the remote tracking branch always into the current branch? (My guess is yes). If not, is there an argument to git pull that specifies the target branch in the merging step? If I am correct, the refspec argument to git pull doesn't specify the target branch for merging step.

Yes, you are correct, git pull does not checkout a different branch and there is not an argument for this. After the pull, manually use the checkout command to switch to a different branch.

Why is the merge a "fast-forward merge"?

A "fast-forward" merge is one in which the fork-point of the topic branch is the tip of the original base branch. (In other words, master has no commits that are not reachable from the branch being merged.)

This is an important concept, so please take a look at this article. In a nutshell, a FF merge will not create a merge commit (which may or may not be desirable in various situations).

Is it correct that the rebase step of git pull rebases the current branch (i.e. the branch pointed to by HEAD) onto the remote tracking branch? If yes, why does the quote say "your local-tracking branch" instead of the current branch?

git pull --rebase will rebase the current branch (HEAD) onto the branch that it is tracking.

score 0 · Answer 4 · answered Dec 28 '16 at 20:23

There is another source of confusion regarding git pull --rebase: doing a rebase where the merge is fast-forward.

After a git fetch (part of the pull --rebase)

-x--x (master)
     \
      o--o--o (origin/master, remote-tracking branch now up-to-date)

The rebase here doesn't need to take place. You cannot replay master "on top of" origin/master, since master is already included in the ancestors of origin/master.
All a git pull --rebase should do here is a fast-forward merge:

-x--x--o--o--o (master, origin/master)

That is what Git 2.12 (Q1 2017) proposes:

See commit 33b842a (29 Jun 2016) by Junio C Hamano (gitster). (Merged by Junio C Hamano -- gitster -- in commit 2fb11ec, 19 Dec 2016)

pull: fast-forward "pull --rebase=true"

"git pull --rebase" always runs "git rebase" after fetching the commit to serve as the new base, even when the new base is a descendant of the current HEAD, i.e. we haven't done any work.

In such a case, we can instead fast-forward to the new base without invoking the rebase process.
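The end result of this scenario can be checked with a short sketch (made-up repository names, hypothetical commit helper). It only observes the outcome: it cannot tell whether your git version took the fast-forward shortcut or ran a rebase that had nothing to replay, since both leave master at the same commit.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
cd "$(mktemp -d)"

# commit <repo> <name>: hypothetical helper, adds <name>.txt and commits it.
commit() {
  echo "$2" > "$1/$2.txt"
  git -C "$1" add "$2.txt"
  git -C "$1" commit -q -m "$2"
}

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
commit upstream x1
commit upstream x2
git clone -q upstream mine

# Three new upstream commits (the "o" commits); no local work at all.
commit upstream o1
commit upstream o2
commit upstream o3

git -C mine pull -q --rebase

master_tip=$(git -C mine rev-parse master)
origin_tip=$(git -C mine rev-parse origin/master)
upstream_tip=$(git -C upstream rev-parse master)
echo "master:        $master_tip"
echo "origin/master: $origin_tip"
```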
