Git pull with refspec

Question

I read this question, and now I have a doubt about how git pull works with a refspec:

Step 1 : I am on branchA.

Step 2 : I do `git pull origin branchB:branchC` .

Step 3: I notice : 

a) Commits from branchB on the remote come in and update `remotes/origin/branchC`.

b) Then a merge happens: `branchC` is updated with `remotes/origin/branchC`.

c) `branchC` is merged into `branchA`.

Now I am confused: since git pull = git fetch + git merge, how did 2 merges happen here? Steps b) and c) are both merges.

torek · Answer 1 · 2018-05-29T14:42:37.250

phd's answer is correct. Break the git pull command into its two components:

1. `git fetch origin branchB:branchC`. Run this on the same setup, i.e., with `branchC` pointing to the commit it pointed to before your `git pull` command.
2. `git merge <hash-id>`. The actual hash ID is taken from `.git/FETCH_HEAD`, where `git fetch` leaves it. Run this on the same setup, with `branchA` pointing to the commit it pointed to before your `git pull` command.

Note that step 2, the git merge, has no effect on the reference branchC. It does have some effect on the current branch name, i.e., refs/heads/branchA. Since it runs git merge, it can do a fast-forward merge, or a true merge, or nothing at all.
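To make the hand-off between the two steps a bit more concrete, here is a minimal Python sketch of reading `.git/FETCH_HEAD`-style text and picking out the hash the merge step would use. It assumes the usual tab-separated layout (hash, an optional `not-for-merge` marker, a description). The sample text, the `<url>` placeholder, and the function name `merge_candidates` are illustrative assumptions, not Git's own code.

```python
def merge_candidates(fetch_head_text):
    """Return hashes from FETCH_HEAD-style text that are not marked not-for-merge."""
    hashes = []
    for line in fetch_head_text.splitlines():
        if not line.strip():
            continue
        # Assumed layout: <hash> TAB <marker or empty> TAB <description>
        fields = line.split("\t")
        commit_hash = fields[0]
        marker = fields[1] if len(fields) > 1 else ""
        if marker != "not-for-merge":
            hashes.append(commit_hash)
    return hashes


# Hypothetical sample: one ref fetched for merging, one fetched only as a side effect.
SAMPLE = (
    "e144d126d74f5d2702870ca9423743102eec6fcd\t\tbranch 'branchB' of <url>\n"
    "093e983b058373aa293997e097afdae7373d7d53\tnot-for-merge\tbranch 'next' of <url>\n"
)

print(merge_candidates(SAMPLE))
```

Only the first hash survives the filter, which matches the idea that the merge step works from a hash ID and never touches `branchC`.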

Let's delve more into the fetch step, which is really the more interesting, or at least challenging, one.

`git ls-remote`

Before running git fetch origin branchB:branchC, run git ls-remote origin. Here's what I get running it on a Git repository for Git (with a lot of bits snipped):

$ git ls-remote origin
e144d126d74f5d2702870ca9423743102eec6fcd        HEAD
468165c1d8a442994a825f3684528361727cd8c0        refs/heads/maint
e144d126d74f5d2702870ca9423743102eec6fcd        refs/heads/master
093e983b058373aa293997e097afdae7373d7d53        refs/heads/next
005c16f6a19af11b7251a538cd47037bd1500664        refs/heads/pu
7a516be37f6880caa6a4ed8fe2fe4e8ed51e8cd0        refs/heads/todo
d5aef6e4d58cfe1549adef5b436f3ace984e8c86        refs/tags/gitgui-0.10.0
3d654be48f65545c4d3e35f5d3bbed5489820930        refs/tags/gitgui-0.10.0^{}
...
dcba104ffdcf2f27bc5058d8321e7a6c2fe8f27e        refs/tags/v2.9.5
4d4165b80d6b91a255e2847583bd4df98b5d54e1        refs/tags/v2.9.5^{}

You can see that their Git offers, to my Git, a long list of reference names and hash IDs.

My Git can pick through these and choose which name(s) and/or ID(s) it likes, and then go to the next phase of git fetch: ask them what hash IDs they can give me that go with, e.g., commit e144d126d74f5d2702870ca9423743102eec6fcd (the hash ID for their master). My Git would do this if I told it to bring over their master or their refs/heads/master as the left hand side of a refspec, since those name-strings match their refs/heads/master.
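As a rough illustration of that name matching, the Python sketch below parses a few lines of `git ls-remote` output and resolves a short name such as `master` to the full advertised reference. The lookup order (name as given, then `refs/`, `refs/tags/`, `refs/heads/`) is a simplified assumption for this example, and `parse_ls_remote` and `resolve_source` are hypothetical helper names, not anything inside Git.

```python
def parse_ls_remote(text):
    """Map each advertised reference name to its hash ID."""
    refs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            hash_id, name = parts
            refs[name] = hash_id
    return refs


def resolve_source(name, refs):
    """Find the advertised ref that a (possibly short) refspec source names.

    Simplified assumption: try the name as given, then under refs/,
    refs/tags/, and refs/heads/, and take the first match.
    """
    for prefix in ("", "refs/", "refs/tags/", "refs/heads/"):
        candidate = prefix + name
        if candidate in refs:
            return candidate, refs[candidate]
    return None


# A few lines copied from the ls-remote output above.
ADVERTISED = """\
e144d126d74f5d2702870ca9423743102eec6fcd        HEAD
468165c1d8a442994a825f3684528361727cd8c0        refs/heads/maint
e144d126d74f5d2702870ca9423743102eec6fcd        refs/heads/master
"""

refs = parse_ls_remote(ADVERTISED)
print(resolve_source("master", refs))
print(resolve_source("refs/heads/master", refs))
```

Both calls land on `refs/heads/master`, which is the sense in which either spelling works as the left-hand side of a refspec.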

(With no refspecs, my Git will ask for all branches. The tags are trickier: --tags has my Git take all, --no-tags has my Git take none, but in between, there's some horribly twisty code inside git fetch.)

In any case, they offer some hashes, my Git says whether it wants or has some other hashes, and their Git uses their git rev-list to construct a set of hash IDs for commits, trees, blobs, and/or annotated tag objects to put into a so-called thin pack. During this phase of git fetch you see messages about the remote counting and compressing objects.

`git fetch origin`

Let me run an actual git fetch now:

$ git fetch origin
remote: Counting objects: 2146, done.
remote: Compressing objects: 100% (774/774), done.
remote: Total 2146 (delta 1850), reused 1649 (delta 1372)

Eventually, their Git finishes packing all the objects they will send, and sends those objects. My Git receives them:

Receiving objects: 100% (2146/2146), 691.50 KiB | 3.88 MiB/s, done.

My Git fixes up the thin pack (git index-pack --fix-thin) to make it a viable normal pack that can live in my .git/objects/pack directory:

Resolving deltas: 100% (1850/1850), completed with 339 local objects.

Finally, the most interesting-to-us parts of the fetch happen:

From [url]
   ccdcbd54c..e144d126d  master     -> origin/master
   1526ddbba..093e983b0  next       -> origin/next
 + 8b97ca562...005c16f6a pu         -> origin/pu  (forced update)
   7ae8ee0ce..7a516be37  todo       -> origin/todo

The names on the left of the -> arrows are their names; the names on the right are my Git's names. Since I ran only git fetch origin (with no refspecs), my Git used my default refspecs:

$ git config --get remote.origin.fetch
+refs/heads/*:refs/remotes/origin/*

so it's as if I wrote:

$ git fetch origin '+refs/heads/*:refs/remotes/origin/*'

which uses fully-qualified refspecs, rather than partial names like branchB:branchC. This particular syntax also uses glob-pattern-like * characters. Technically these aren't quite globs, as these are just strings and not file names, and there is a * on the right, but the principle is similar: I ask my Git to match every name starting with refs/heads/, and copy those to my own repository under names starting with refs/remotes/origin/.
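The matching-and-copying idea can be sketched in a few lines of Python. This is only a model of how a single-`*` refspec maps their names onto mine, assuming one `*` on each side; `parse_refspec` and `map_ref` are made-up names for the example, not Git functions.

```python
def parse_refspec(refspec):
    """Split '+src:dst' into (force, src, dst)."""
    force = refspec.startswith("+")
    if force:
        refspec = refspec[1:]
    src, _, dst = refspec.partition(":")
    return force, src, dst


def map_ref(refspec, remote_name):
    """Return the local name for remote_name under the refspec, or None if no match."""
    _, src, dst = parse_refspec(refspec)
    prefix, star, suffix = src.partition("*")
    if not star:
        # No wildcard: the source must match exactly.
        return dst if remote_name == src else None
    if (len(remote_name) >= len(prefix) + len(suffix)
            and remote_name.startswith(prefix)
            and remote_name.endswith(suffix)):
        # Whatever the * matched on the left is substituted for the * on the right.
        matched = remote_name[len(prefix):len(remote_name) - len(suffix)]
        return dst.replace("*", matched, 1)
    return None


DEFAULT = "+refs/heads/*:refs/remotes/origin/*"
for name in ("refs/heads/master", "refs/heads/pu", "refs/tags/v2.9.5"):
    print(name, "->", map_ref(DEFAULT, name))
```

The tag name falls outside `refs/heads/`, so this refspec maps it to nothing, while the two branch names end up under `refs/remotes/origin/`.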

The refs/heads/ name-space is where all of my Git's branch names reside. The refs/remotes/ name-space is where all of my Git's remote-tracking names reside, and refs/remotes/origin/ is where my Git and I have placed the remote-tracking names that correspond to branch names we found in the Git at origin. The leading plus sign + in front sets the force flag, as if I had run git fetch --force.

Reference name updates

The next step requires that we look at the commit graph—the Directed Acyclic Graph or DAG of all commits found in my Git repository. In this case, since the new pack file has been integrated, this includes all the new objects I've just added via git fetch, so that I have new commits (and any trees and blobs necessary to go with them) obtained from their Git.

Each object has a unique hash ID, but these are too unwieldy to use directly. I like to draw my graphs left-to-right in text on StackOverflow, and use round `o`s or single uppercase letters (or both) to denote particular commits. Earlier commits go towards the left, with later commits towards the right, and a branch name points to the tip commit of that branch:

...--o--o--A   <-- master
            \
             o--B   <-- develop

Note that in this view of the Git object database, we pay no attention at all to the index / staging-area, and no attention at all to the work-tree. We are concerned only with the commits and their labels.

Since I actually obtained my commits from the Git at origin, my Git has origin/* names as well, so let's draw those in:

...--o--o--A   <-- master, origin/master
            \
             o--B   <-- develop, origin/develop

Now, suppose that I run git fetch and it brings in two new commits that I will label C and D. C's parent is A, and D's is the node just before B:

             C
            /
...--o--o--A   <-- master
            \
             o--B   <-- develop
              \
               D

For my Git to retain these commits, my Git must have some name or names by which it can reach these commits. The name that reaches C is going to be origin/master, and the name that reaches D is going to be origin/develop. Those names used to point to commits A and B respectively, but git fetch origin +refs/heads/*:refs/remotes/origin/* tells my Git to replace them, giving:

             C   <-- origin/master
            /
...--o--o--A   <-- master
            \
             o--B   <-- develop
              \
               D   <-- origin/develop

The output from this git fetch will list this as:

   aaaaaaa..ccccccc  master     -> origin/master
 + bbbbbbb...ddddddd develop    -> origin/develop  (forced update)

Note the + and the three dots in the output here. That's because while moving origin/master from commit A (hash ID aaaaaaa) to commit C was a fast-forward operation, moving origin/develop from commit B to commit D was not. This required the force flag.
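The two-dot versus three-dot distinction can be reproduced with a small Python model of the graph drawn above. The labels `A0` and `o1` are hypothetical names for the unlabeled `o` commits, and `describe_update` only imitates the shape of the fetch summary; it is a sketch, not what `git fetch` runs internally.

```python
# Parent links for the example graph. "A0" stands for an unnamed commit
# before A, and "o1" for the unnamed commit just before B (both hypothetical labels).
PARENTS = {
    "A0": [],
    "A": ["A0"],
    "o1": ["A"],
    "B": ["o1"],
    "C": ["A"],
    "D": ["o1"],
}


def is_ancestor(old, new, parents):
    """True if old is reachable from new by following parent links."""
    stack, seen = [new], set()
    while stack:
        commit = stack.pop()
        if commit == old:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents[commit])
    return False


def describe_update(old, new, parents):
    """Imitate the two-dot / three-dot lines of the fetch summary."""
    if is_ancestor(old, new, parents):
        return f"   {old}..{new}"
    return f" + {old}...{new}  (forced update)"


print(describe_update("A", "C", PARENTS))  # origin/master: A is an ancestor of C
print(describe_update("B", "D", PARENTS))  # origin/develop: B is not an ancestor of D
```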

This same process works even if you use local branch names

If you run git fetch origin br1:br2, you instruct your Git to:

call up the Git at origin (really remote.origin.url)
obtain their list of branch names
use their br1 (probably refs/heads/br1) to update your br2—most likely your refs/heads/br2, bringing over whatever objects are necessary to make this happen.

This update phase, updating your br2 based on their br1, does not have a force flag set on it. This means that your Git will permit the change if and only if the operation is a fast-forward.

(Meanwhile, your Git will also update your origin/br1, because Git does this kind of opportunistic update based on remote.origin.fetch. Note that this update does have the force flag set, assuming a standard remote.origin.fetch configuration.)
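Here is a small Python sketch of the rule at work in these two updates: a reference moves if the move is a fast-forward or the refspec carries the force flag. The scenario (their `br1` was rewound, so neither update is a fast-forward) and the helper names are assumptions made up for the illustration.

```python
def split_refspec(refspec):
    """Split '+src:dst' into (force, src, dst)."""
    force = refspec.startswith("+")
    src, _, dst = refspec.lstrip("+").partition(":")
    return force, src, dst


def decide(refspec, is_fast_forward):
    """Return (destination, outcome) for one reference update."""
    force, _, dst = split_refspec(refspec)
    if is_fast_forward:
        return dst, "fast-forward"
    if force:
        return dst, "forced update"
    return dst, "rejected (non-fast-forward)"


# Hypothetical situation: neither update is a fast-forward.
# The refspec typed on the command line has no leading plus sign ...
print(decide("br1:br2", is_fast_forward=False))
# ... while the opportunistic update comes from the standard configured refspec, which has one.
print(decide("+refs/heads/br1:refs/remotes/origin/br1", is_fast_forward=False))
```

Under these assumptions the `br2` update is refused while the `origin/br1` update goes through as a forced update.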

Fast-forward is really a property of a label move

We (and Git) talk about doing a fast-forward merge, but this is a misnomer, for two reasons. The first and most important is that fast-forward is a property of a label's motion. Given some existing reference label (branch, tag, or whatever) R that points to some commit C1, we tell Git: move R to point to commit C2 instead. Assuming both hash IDs are valid and point to commits, when we examine the commit DAG, we will find that:

C1 is an ancestor of C2. This change to R is a fast-forward.
Or, C1 is not an ancestor of C2. This change to R is a non-fast-forward.

The special property of a fast-forward operation is that now that R points to C2, if we start at C2 and work backwards as Git always does, we will eventually come across C1. So C1 remains protected by a name, and if R is a branch name, commit C1 is still on branch R. If the operation is not a fast-forward, C1 is not reachable from C2, and C1 may no longer be protected and may—depending on whether anything else protects it, and its relative age—be garbage collected at some point in the future.
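To see the "protection" idea in action, the Python sketch below computes which commits stop being reachable from any name after a label is moved. The tiny graph and the helper names `reachable` and `unprotected_after_move` are assumptions for the example; real garbage collection also considers reflogs and object age, which this model ignores.

```python
def reachable(tips, parents):
    """Collect every commit reachable from the given tip commits."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def unprotected_after_move(refs, name, new_target, parents):
    """Commits reachable from some name before the move but from none after it."""
    before = reachable(refs.values(), parents)
    after_refs = {**refs, name: new_target}
    after = reachable(after_refs.values(), parents)
    return before - after


# "o1" is a hypothetical label for the shared parent of B and D.
PARENTS = {"A": [], "o1": ["A"], "B": ["o1"], "D": ["o1"]}

# Only one name points at B, so a non-fast-forward move to D strands B.
print(unprotected_after_move({"origin/develop": "B"}, "origin/develop", "D", PARENTS))
# With a second name on B, the same move strands nothing.
print(unprotected_after_move({"origin/develop": "B", "develop": "B"},
                             "origin/develop", "D", PARENTS))
```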

Because of the above, updating a branch style reference—a branch name in refs/heads/ or a remote-tracking name in refs/remotes/—often requires using a force flag, if the update is not a fast-forward. Different parts of Git implement this differently: git fetch and git push both have --force and leading-plus-sign, while other Git commands (that don't have refspecs) just have --force or, as in the case of git reset, just assume that you—the user—know what you are doing.

(Very old versions of Git, 1.8.2 and older, accidentally applied these fast-forward rules to tag names as well as branch names.)

The `git merge` command knows about the index and work-tree

What makes a git merge fast-forward merge operation different—well, at least slightly different—from this kind of label fast-forwarding is that git merge knows about, and works with, your index / staging-area and your work-tree. When you run:

git merge <commit-specifier>

Git computes the merge base of the current HEAD commit and the given other commit. If this merge base is the current commit, the operation can be done as a fast-forward label move, as long as Git also brings the index and work-tree along with it.

If the merge base is the other commit, there is nothing to merge. Otherwise, if the merge base is a strict ancestor of the current commit, or if you use the --no-ff flag, git merge must perform a true merge and make a new merge commit. (Of course there are also flags to suppress the commit and to make the new commit as an ordinary, non-merge commit, so this view of git merge skips a few important details as well.)
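The decision can be modeled in Python without computing the merge base explicitly: the merge base equals the current commit exactly when the current commit is an ancestor of the other one. The graph, the `o1` label, and the function names are assumptions for this sketch, and it ignores the index and work-tree entirely, which is precisely the part that the real `git merge` adds.

```python
def ancestors(commit, parents):
    """Return the commit together with all of its ancestors."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def classify_merge(head, other, parents, no_ff=False):
    """Say which kind of operation merging `other` into `head` would be."""
    if other in ancestors(head, parents):
        # The merge base is the other commit: nothing to do.
        return "already up to date"
    if head in ancestors(other, parents) and not no_ff:
        # The merge base is the current commit: the label can simply slide forward.
        return "fast-forward"
    return "true merge"


# master is at A, develop at B, and C is a later commit on top of A.
PARENTS = {"A": [], "o1": ["A"], "B": ["o1"], "C": ["A"]}

print(classify_merge("A", "B", PARENTS))              # on master, merging develop
print(classify_merge("B", "A", PARENTS))              # on develop, merging master
print(classify_merge("C", "B", PARENTS))              # histories have diverged
print(classify_merge("A", "B", PARENTS, no_ff=True))  # fast-forward suppressed
```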

Q1) "***..Since I actually obtained my commits from some other Git, my Git has origin/* names as well..***": I did not understand this. Do you mean that you had origin pointing to some remote repo and then made it point to some other remote repo? Did you do that for explanation purposes? Otherwise, why would someone do that? If they want some other remote, they create it under some other name. — Number945, May 29 '18 at 13:14
Q2) In your diagram, if I had used `git fetch origin develop:develop` instead of your `git fetch origin`, then according to you my `origin/develop` would get updated (a non-fast-forward update, as in the diagram), and then git fetch would try to update the develop branch; but since that would be a non-fast-forward update, it would fail. Am I right? And if so, will the changes made to origin/develop be reverted? (Assume standard values for remote.origin.fetch.) — Number945, May 29 '18 at 13:36
Re Q1: that was just generic phrasing. In this *specific* repository, the "other Git" is the one at `origin`. The idea was to emphasize that while I do have those commits and *I* created the names `master` and `develop`, I got *those* commits from some other Git through `git fetch` and therefore I have *more* names that point to those commits. I'll rephrase that as "from `origin`" though. — torek, May 29 '18 at 14:38
Re Q2: Yes, `git fetch origin develop:develop` will indeed attempt a non-forced update of my own `develop` (the `:develop` part) based on what I receive from `origin` for their `develop` (the `develop:` part). Since that's a non-fast-forward, that part of the update will fail. I would have to run a test to find out whether this aborts the updating of `origin/develop`, but I suspect it does not. — torek, May 29 '18 at 14:41
Thanks for the reply. Based on the added clarity from the replies, I have one small doubt. Q3) There is a strange thing going on in the diagram. When we do git fetch, the commit `D` lands on `o` and not on `B`, which the remote-tracking branch pointed to earlier. Such a scenario can only exist if somebody force-updated the `develop` branch in the remote repo. Otherwise `D` must always be a descendant of `B`. Am I right? — Number945, May 29 '18 at 14:52
Yes: the slightly strange behavior of commit `D` is a result of a force-push to `origin` (if `origin` is a typical server; if it's another repository you control from a command-line Git, it could be the result of a `git reset`, or an interactive rebase, or something along those lines). — torek, May 29 '18 at 14:56

score 2 · Answer 2 · answered May 26 '18 at 17:51

Step 2 is not a true merge, it's a fast-forward merge. Fast-forwarding is the only kind of merge possible for a non-current (i.e., not currently checked out) branch. If fast-forwarding is not possible, git aborts the fetch/pull; in that case you can either do a true merge (check out branchC and run git pull origin branchB) or do a forced update (git fetch origin +branchB:branchC), thus losing your local commits at the head of branchC.

phd

See, `git fetch` does not do any fast-forward merging. Then who ordered the 2 merges? Was my command broken down into `git fetch origin branchB:branchC` + `git merge branchC`? Am I right? – Number945 May 26 '18 at 18:24
`git fetch origin branchB:branchC` **does** fast forward merging! If it couldn't it would fail with an error message. – phd May 26 '18 at 18:54
I believe your answer might not be completely correct in a few places, which I have highlighted here: https://stackoverflow.com/a/50654727/2844702 – Number945 Jun 02 '18 at 07:55

score 1 · Accepted Answer · answered Jun 02 '18 at 07:52

Well, after reading @torek-ans-1 and @torek-ans-2 [these are must-reads for understanding how git fetch/pull works], I'd like to post a complete answer to my question for those who want to get it quickly.

First, the steps in the question are wrong. These are the correct steps:

Step 1 : I am on branchA.

Step 2 : I do `git pull origin branchB:branchC` .

Step 3: I notice : 

a) Commits from branchB on the remote come in and update `refs/heads/branchC`.

b) Then `remote.origin.fetch` is used to try to update `remotes/origin/branchB` in our local repository.

[Notice that no attempt is made to update `remotes/origin/branchC`.]

c) `branchC` is merged into `branchA`.

[The order might vary from one Git version to another.]

In step a) + step b), there is no merge. This is called a fast-forward update. There is also something called a fast-forward merge, which behaves the same way, but we say fast-forward merge only when git merge behaves like a fast-forward update.

Here in step a) + step b) no git merge is called. Hence, we call it a fast-forward update and not a fast-forward merge.

Step c) is where git merge is called.

In short: git pull origin branchB:branchC = git fetch origin branchB:branchC ((a) + (b)) + git merge branchC (c)

Now, my question was: why were 2 merges called?

There are not 2 merges. There is only 1 merge, in step c). Yes, there are 2 fast-forward updates, and git fetch does them.
