You are doing things that are not "built in" to Git, so there is not necessarily a correct way. There are merely a lot of options. There is a way—or multiple ways, really—to deal with this that is built-in, so you might want to switch to that, but let's get there by starting with what you are doing.
Let's look first at what happens when you run git fetch. I'll assume here that you have two remotes, one named origin and the other named upstream. Here's what these remote names do for you:
Each remembers a URL. The URL for origin and the URL for upstream are different, and do not even both need to be on GitHub (although since you mention GitHub forks, I assume both are). You get to type in a shorter name, though.
Each also provides a prefix for remote-tracking branch names. You now have both origin/master and upstream/master, for instance.
It's this second part that you can (in management-speak) "leverage" for the pull requests. But first, let's talk about branches, branch names, and name spaces.
Git branches vs Git branch names
The term branch, in Git, is ambiguous. It can mean a branch name like master or pr-5737, but it can also refer to a branch structure within the commit graph. Each commit has its own unique hash ID, and each commit also records the ID of a previous commit. For instance, in a new repository with just three commits, we might have:
A <- B <- C <-- master
where the name master contains the ID of commit C, and C itself contains the ID of commit B. We say that master points to C, C points to B, and B points to A. Since A is the very first commit, there's nothing for it to point back to; we call this a root commit. These pointers all work backwards: the branch name finds the tip commit for us (and for Git), and the tip commit finds an earlier commit, which finds more commits, all the way back to the root.
Adding a new commit simply means writing a commit that points back to the current tip, and then updating the branch name to point to the new commit:
A <- B <- C <- D <-- master
I generally draw these without the internal backwards arrows, which makes the drawing more compact and allows displaying branches:
A--B--C--D <-- master
    \
     E--F <-- feature
Note, again, that the branch names only point to the (two, in this case) tip commits. It's those commits that point back to the rest of the commits. These chains of commits are also Git branches. For more on this, see What exactly do we mean by "branch"? We use the term branch interchangeably, in Git, to mean both branch names and branch structures within the commit graph. It's usually obvious which one is meant—though if it's not, you should ask!
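The pointer chain described above can be inspected directly. What follows is only a minimal sketch, assuming a throwaway repository in a temporary directory with three empty commits A, B, and C; the demo identity and the variable names are made up for the illustration.

```shell
#!/bin/sh
# Sketch: a branch name stores one commit ID; each commit stores its parent's ID.
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master    # make sure the branch is named master
for msg in A B C; do
    git commit -q --allow-empty -m "$msg"
done

tip=$(git rev-parse master)                   # ID stored under the name master (commit C)
parent=$(git rev-parse master^)               # ID that C records as its parent (commit B)
root=$(git rev-list --max-parents=0 master)   # the commit with no parent (commit A)

# One line per commit: its own abbreviated ID, its parent's ID, and its subject.
git log --format='%h <- parent: %p  (%s)' master
```

The root commit prints an empty parent field, which is the "nothing for it to point back to" case.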
One key item here is that commits can be pointed-to by things that are names, but are not branch names. For instance, tags can also point to commits (and of course commits point to commits). There are very few restrictions on these names, and in fact, pull requests are just yet another category of names that point to commits. The general term for all of these is references: in Git, a reference is any name for a commit—usually the tip commit of a branch (the graph branch)—or any other Git object (but we'll ignore the three other types of Git objects here).1
The other key item is that in normal (non-maintenance-command) usage, Git can only get started, in terms of finding commits in the commit graph, using these names. It needs a branch name, or tag name, or something, to find a starting ID. (You can give your Git a raw ID, possibly abbreviated, which is what you do if you run git log ac0ffee for instance. It starts from that commit and works backwards from there.)
1Some parts of Git assert that all references start with refs/, but others note that there are special non-refs/ references, such as HEAD, ORIG_HEAD, CHERRY_PICK_HEAD, MERGE_HEAD, and so on. I would put it as most references start with refs/.
git fetch brings in commits
When you run git fetch, you have your Git call up another Git. That other Git has its own repository. Your Git uses the URL stored under the remote name to contact the other Git, and then your Git and their Git have a little conversation. Their Git tells your Git about their branch tips: the names, and the commits (the hash IDs) identified by those names.
These hash IDs are computed, in a complicated (cryptographic) but fully deterministic way, from the contents of the commits. Any two commits that are 100% identical, bit-for-bit, have the same IDs. This means that any commits that your Git and their Git both have in both repositories, have the same IDs. So, your Git can tell if you already have their commits, or not.
If your Git doesn't have their commits, and you've told your Git to bring them in (by asking for that name, or all names), your Git then requests those commits from their Git. Their Git bundles them up and sends them over, and your Git stores them in your repository, using their unique hash IDs.
Fetch must now assign names
Git now needs some way to find these commits, now that it has these commits. This means that git fetch must save some names. Of course, all of the branch tips and other tip commits that your Git got from the other Git, had names over there. Why can't it just use those names?
Think about this for a moment and the reason becomes obvious. Suppose the name it got was "branch master", and it overwrote your master with this new value. You'd lose easy access to your own commits!2
The simplest way for git fetch to save these name/ID pairs is to write them all into a file named FETCH_HEAD. They will stay there, safe and sound, until the next git fetch overwrites them.
That's what you are doing with this (second) command:
$ git fetch upstream pull/5737/head
remote: Counting objects: 4, done.
remote: Total 4 (delta 3), reused 3 (delta 3), pack-reused 1
Unpacking objects: 100% (4/4), done.
From https://github.com/OrigProj/repo-name
* branch refs/pull/5737/head -> FETCH_HEAD
The remote: messages are coming from that other Git: it's packaged up four objects (probably one commit, one tree, and two blobs, though I am just guessing) into a thin pack with delta compression applied ("delta 3"). Your Git got the package and unpacked it. Your Git used one name from their Git, refs/pull/5737/head, just as you told it to. And, your Git did not store this under your own name, but merely in the FETCH_HEAD file.
If you like, you can now extract this commit's ID from the file FETCH_HEAD. You can do that by looking inside the file (the format will be reasonably obvious), or, since there's only one ID in it, just by using the name FETCH_HEAD. Just remember that the next fetch will overwrite the file, forgetting the ID. Once this forgetting happens, if the new commit you just got has no names that can find it, that makes the commit eligible for garbage collection: it will eventually get thrown away. But you have a chance, now, to give it your own name(s).
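Here is a small sketch of that sequence: fetch with no destination, read the ID back through FETCH_HEAD, and give the commit a name before the next fetch. It runs entirely offline, so a local directory named upstream stands in for the real remote, and its refs/pull/5737/head is created by hand with git update-ref (on GitHub, the server creates that reference). The demo identity and the variable names are likewise assumptions.

```shell
#!/bin/sh
# Sketch: fetch a pull-request ref into FETCH_HEAD only, then name the commit.
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the upstream repository, with a hand-made pull-request ref.
git init -q upstream
(
    cd upstream
    git symbolic-ref HEAD refs/heads/master
    git commit -q --allow-empty -m "base"
    git checkout -q --detach                     # leave master pointing at "base"
    git commit -q --allow-empty -m "PR commit"
    git update-ref refs/pull/5737/head HEAD      # fake pull-request ref
)

# Your own repository.
git init -q work && cd work
git remote add upstream "$tmp/upstream"
git fetch -q upstream pull/5737/head             # no destination: only FETCH_HEAD is written
fetched=$(git rev-parse FETCH_HEAD)              # read the ID back through the name
git branch pr-5737 FETCH_HEAD                    # name the commit before the next fetch
echo "pr-5737 now points to $fetched"
```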
Let's compare, though, to your first command:
$ git fetch upstream pull/5737/head:pr-5737
Note the colon here, and that the output ended with:
* [new ref] refs/pull/5737/head -> pr-5737
The earlier command did not say FETCH_HEAD. Your Git wrote the name, instead, to a new reference, pr-5737. This reference is in fact refs/heads/pr-5737, through some assumptions that git fetch makes.3 For now, let's just note that the "full name" of any branch is refs/heads/branch, e.g., refs/heads/master for the branch-name master.
This colon-separated form, by the way, is a refspec. In this simple form, a refspec is just a pair of reference names separated by a colon: a source name, and a destination. With git fetch, the source is their name (branch or other reference), and the destination is your name. You may choose any kind of name, not just a branch name, for either side. For fetch, leaving out the destination name, as in the second command above, means you want to just write the information to FETCH_HEAD.4
2But note that this is (partly) how tags work. The idea of tags is to be global across all repository clones. The question then becomes whether and when your own tags will be overwritten by another Git's tags, and the answer to that is complicated (and not appropriate for this posting).
3Both git fetch and git push have some complicated code to qualify an unqualified reference. A name like master or branch or pr-5737 does not start with refs/, so it is unqualified. If you write out refs/pull/5737/head or similar, this does start with refs/ and is qualified, and does not go through this complicated bit of code. In a few cases, Git can't do the qualification on its own, or does it wrong, and makes you write out the full name. That's true for these pull names, for instance. Usually, though, it does pretty well at guessing whether something is a branch or a tag. In this case, it guesses that you meant to make a branch, which is probably correct.
4Since Git version 1.8.4, Git will do an opportunistic update of remote-tracking branches. We'll see more about this later on.
Fast-forwarding and force-updating
I cannot figure out the command to ‘merge’ these newly fetched commits to the head of the pull request branch.
It's time to go back to the graph drawings.
Your first git fetch brought in some commits (probably just one, again) but it gave them a branch name in your repository, pr-5737. That commit itself points back to a previous commit, which I'll guess for now is one that is mainly or only on their (upstream's) branch master. The graph fragment is therefore:
...--o <-- upstream/master
      \
       o <-- pr-5737
Now, it may be that you have already updated your own master to match upstream/master. In that case, we should draw the graph this way:
...--o <-- master, upstream/master
      \
       o <-- pr-5737
Note that the commit graph is unchanged! All we did was change the labels a bit. That's the key to this process. The fetch will always get you the commits; what you want to do is to change (or perhaps add) labels, such as branch names, as you incorporate these commits into your own graph. The fetch step modifies (adds to) the commit graph, and after that, we need something to happen with names.
So, let's look at the graph after the second git fetch. First, let's assume this adds one more commit that points back to the previous pull-request commit:
...--o <-- upstream/master
      \
       o <-- pr-5737
        \
         o <-- FETCH_HEAD
In this case, you probably want to move pr-5737 to point to the same commit as FETCH_HEAD. But, what if the new commit doesn't point back to the previous pull-request commit? What if instead, it points to the upstream/master commit? Well, let's draw that:
...--o <-- upstream/master
     |\
     | o <-- pr-5737
     |
      \
       o <-- FETCH_HEAD
Now you probably want to move pr-5737 to point to the new commit. (Or, maybe you don't want that: it's up to you to decide what you want. But I'll assume that you do want it.)
There are a bunch of Git commands that will move a branch label. The most user-oriented is git branch: with git branch --force you can re-set any branch that you don't currently have checked out. (To re-set the branch you do have checked-out, you need to use git reset instead, for a bunch of good technical reasons that amount to Git letting the implementation show through.) You could just run:
git branch -f pr-5737 FETCH_HEAD
to forcibly move the name pr-5737 to point to the commit git fetch just brought in.
(Again, if you have pr-5737 checked out at the moment, you have to use git reset instead, and then you must choose: --soft, --mixed, or --hard? These control whether the reset operation affects the index and work-tree. Let's just assume that you don't have it checked out. :-) )
Now, if the new commit we just brought in, the one at FETCH_HEAD, "adds to the branch" (i.e., the new commit(s) point(s) back to the tip of pr-5737), this branch label movement is what Git calls a fast-forward. Note how, in the first FETCH_HEAD drawing above, Git can kind of slide the label forward (rightward, while also going down) to the new commit:
...--o <-- upstream/master
      \
       o
        \
         o <-- pr-5737, FETCH_HEAD
With the second drawing, however, forcing pr-5737 to point to the new commit causes the old one to be forgotten! We have to back the label up one step, to point to the tip of upstream/master, and then back down and right. This is a non-fast-forward forced update.
If we use git branch -f, it will update pr-5737 even if it cannot be fast-forwarded. What if you only want to move pr-5737 if it's a fast-forward?
Merge can also fast-forward
You have no doubt seen Git print "Fast-forward" when merging. This is because if you are on one of your branches, and run git merge name, Git will check whether the commit it finds under the given name (usually, the tip of a branch) is "fast-forward ahead" of the tip commit of the current branch. If so, Git doesn't actually merge anything, it just slides the branch name forward (and checks out the new commit, so that your index and work-tree match the new branch tip).
If you use the git merge command, you can limit it to working only if the merge will really be a fast-forward instead, using --ff-only. (And, you can force it not to do a fast-forward at all, but rather make a new merge commit, using --no-ff.) So, you could git checkout pr-5737 and then git merge --ff-only FETCH_HEAD to make the fast-forward happen. Of course, this could fail, as it would for the second case. Then you have to decide what you want to do.
You probably don't want to merge these two commits. (If you really do, for whatever reason, you can: just run git merge FETCH_HEAD. That's probably not useful though.) You probably just want to force the branch to move and to load the new tip commit into your index and work-tree, in which case, you can git reset --hard FETCH_HEAD. If you go this way, though, you'll know, based on whether git merge --ff-only worked, whether the updated pull request is a replacement, or an add-on.
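If you would rather not check the branch out at all, one alternative (not something the commands above require) is to ask Git whether the old tip is an ancestor of the new commit, and move the label only in that case. This is a minimal sketch using git merge-base --is-ancestor in a throwaway repository; the commit held in the variable new is only a stand-in for whatever FETCH_HEAD would name after a real fetch, and all variable names are made up.

```shell
#!/bin/sh
# Sketch: move a branch label only when the move would be a fast-forward.
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q repo && cd repo
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "old pull-request tip"
git branch pr-5737                          # pr-5737 exists but is not checked out
git commit -q --allow-empty -m "add-on commit"
new=$(git rev-parse HEAD)                   # stand-in for FETCH_HEAD

# --is-ancestor exits 0 when pr-5737 is reachable from $new (an "add-on").
if git merge-base --is-ancestor pr-5737 "$new"; then
    git branch -f pr-5737 "$new"            # safe: nothing is forgotten
    result="fast-forwarded"
else
    result="not a fast-forward; left alone" # a replacement: decide by hand
fi
echo "$result"
```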
Doing it the easy way: git fetch can do it for you
In your original git fetch you told your Git to write the name pr-5737:
$ git fetch upstream pull/5737/head:pr-5737
You can use this exact same command again. Your Git will obtain any new commit(s), and then try to update your existing refs/heads/pr-5737.
As before, this could be a fast-forward. In this case, your Git will do the update. (You still get a FETCH_HEAD file but you don't need it any more.) Or, it could be a non-fast-forward. In this case, your Git will error-out:
! [rejected] refs/pull/5737/head -> pr-5737 (non-fast-forward)
To force the update, we use one more feature of a refspec: it can start with a plus sign, which means "force". So:
git fetch upstream +pull/5737/head:pr-5737
This time, you'll get the update, with the annotation (forced update) added. The annotation is only added if the update is actually forced, so as with doing a manual git merge --ff-only, you will know whether the update had to be forced.
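The difference between the plain refspec and the plus-sign form can be reproduced offline. In this sketch a local directory named upstream again stands in for the real remote, its refs/pull/5737/head is created by hand, and the "PR v2" commit simulates the pull-request author replacing the original commit; the variable names are invented for the example.

```shell
#!/bin/sh
# Sketch: a non-fast-forward update is rejected unless the refspec starts with +.
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream
(
    cd upstream
    git symbolic-ref HEAD refs/heads/master
    git commit -q --allow-empty -m "base"
    git checkout -q --detach
    git commit -q --allow-empty -m "PR v1"
    git update-ref refs/pull/5737/head HEAD      # fake pull-request ref
)

git init -q work && cd work
git remote add upstream "$tmp/upstream"
git fetch -q upstream pull/5737/head:pr-5737     # creates refs/heads/pr-5737

# Simulate the author replacing the pull request with a commit built on "base".
(
    cd "$tmp/upstream"
    git checkout -q --detach master
    git commit -q --allow-empty -m "PR v2"
    git update-ref refs/pull/5737/head HEAD
)

# Same command as before: not a fast-forward, so Git refuses.
if git fetch -q upstream pull/5737/head:pr-5737 2>/dev/null; then
    plain=accepted
else
    plain=rejected
fi

# With the leading plus sign, the update is forced.
git fetch -q upstream +pull/5737/head:pr-5737
echo "plain refspec: $plain; pr-5737 is now: $(git log -1 --format=%s pr-5737)"
```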
The really easy, fully automated way
Now, it might be nice if you could get git fetch to do this update without having to type:
git fetch upstream +pull/5737/head:pr-5737
all the time. And you can; in fact, you can bring in any and all pull requests that have the form refs/pull/NNNN/head, or any other form you care to recognize. It's up to you to decide how to bring them in, but before we dive into the mechanism, let's mention name spaces, and the role of a remote name in remote-tracking branch names.
A name space (or namespace as a single word) is an organization, typically hierarchical, where different groups of names are, well, grouped. In Git's case, for instance, most references are under refs/, but all branch names are under refs/heads/, as we saw earlier. All tags are under refs/tags/. These names work like directories (and are actually implemented as such, in some cases). The space beginning with refs/remotes/ holds all remote-tracking branch names, but it's further subdivided: there is one space for the remote named origin, under refs/remotes/origin/, and another for upstream, under refs/remotes/upstream/.
By sub-dividing the remote-tracking branches, and separating them from your regular branches, Git guarantees that it will never use, as a remote-tracking branch name, any of your own branch names. Your own branches all start with refs/heads/, and refs/remotes/origin/ does not start with refs/heads/, so these names are separate. Moreover, by including the name of the remote, Git tries to guarantee that these also never collide: refs/remotes/origin/ is always different from refs/remotes/upstream/.5
If you allow pull requests, of the form pr-number, to occupy the same name space as your branches, and if you name one of your own branches, say, pr-123, you can get a collision. So don't do that: either make sure you never name your branches like this, or pick your own name space for your pull-request-trackers. You may want to stick with branch names since Git only has three built-in forms it recognizes, for branches, tags, and remote-tracking branches; so branch names are shorter to type. (That's why you had to spell out pull/5737/head rather than just 5737/head: the full name is refs/pull/5737/head, and Git can find this under refs, but not without the pull part. Your master's full name is refs/heads/master, but you don't have to type in heads/master.)
(For reasons I will mention in a moment, you might want to spell your dedicated pull-request sub-branch name space pr/* instead of pr-*. I'll assume from now on, you want pr/5737 instead of pr-5737.)
If you open your .git/config file in your editor (note that you can do this with git config --edit, so Git kind of encourages this; it's relatively safe as long as your editor does not try to convert the configuration to rich text or something equally silly) you will see a configuration section for each remote:
[remote "origin"]
url = ...
fetch = +refs/heads/*:refs/remotes/origin/*
[remote "upstream"]
url = ...
fetch = +refs/heads/*:refs/remotes/upstream/*
The url lines provide the saved URLs: they are how your Git knows how to call up the other Git. The more interesting lines, for us, are the fetch ones. These provide the default fetch refspecs.
If you run git fetch origin, that means:
git fetch origin +refs/heads/*:refs/remotes/origin/*
This is how Git implements the remote-tracking branches: a git fetch just processes the refspecs it is given, even if those are the ones implied by the per-remote fetch default configuration. Git also allows more than one fetch = line, and it adds all of them as the default set of refspecs.
This shows that refspecs can also do a kind of wild-card matching. This matching is a limited form of shell style glob match. This means you can add a second fetch = line that reads:
+refs/pull/*/head:refs/heads/pr/*
Now, if the remote has a reference named refs/pull/5737/head when you run git fetch, your Git will create or update (forcibly if needed) your own branch pr/5737.
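Rather than editing the file by hand, the extra line can also be added with git config --add, which appends a value instead of replacing the existing one. The sketch below does that and then runs a plain git fetch upstream. As before, it is an offline stand-in: the local upstream directory and its hand-made refs/pull/5737/head are assumptions made so the example can run without GitHub.

```shell
#!/bin/sh
# Sketch: add a second default fetch refspec so pull requests arrive as pr/<number>.
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream
(
    cd upstream
    git symbolic-ref HEAD refs/heads/master
    git commit -q --allow-empty -m "base"
    git checkout -q --detach
    git commit -q --allow-empty -m "PR commit"
    git update-ref refs/pull/5737/head HEAD      # fake pull-request ref
)

git init -q work && cd work
git remote add upstream "$tmp/upstream"          # writes the usual refs/heads/* refspec

# Append (do not replace) a second fetch = line for this remote.
git config --add remote.upstream.fetch '+refs/pull/*/head:refs/heads/pr/*'
git config --get-all remote.upstream.fetch       # lists both default refspecs

git fetch -q upstream                            # no refspec given: the defaults apply
git for-each-ref --format='%(refname)' refs/heads/pr refs/remotes/upstream
```

Note that git fetch refuses to update a branch you currently have checked out, so this scheme works best when you inspect pr/NNNN branches without staying on them.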
If your Git is new enough, you can use one * glob pretty much anywhere, e.g., the rather peculiar:
+refs/pull/5*/head:refs/heads/pr-5*
which will obtain only pull requests starting with 5, updating your own branch name starting with pr-5. But in versions of Git before 2.6.0 (commit cd377f4), the * had to match a whole component, e.g., pull/*/head but not pull/5*/head, or pr/* but not pr-*. If your Git is 2.6.0 or later, you can fetch to pr-*, but if not, you must fetch to pr/*.
(Since Git version 1.8.4, if Git is fetching a reference, and it has one of these matching fetch = lines, your Git will update the corresponding remote-tracking branch, automatically, even if you gave some refspecs on the command line. This does not occur in older versions of Git. But even in those older versions of Git, if you succeed at a git push that pushes something matching one of these refspecs, Git will opportunistically update the remote-tracking branch. The fact that push did it was what finally convinced the Git folks that fetch should do it too.)
5This attempt fails in subtle ways if you name one remote a and another a/b. Git should forbid slashes in remote names, but it doesn't. (So, don't use slashes in remote names, or if you do, make sure none is ever a prefix of another.)
There are more options
You don't have to make the pull requests into (regular) branches. You could add them as remote-tracking branches, for instance, using your own invented pr/ sub-name-space:
[remote "upstream"]
fetch = +refs/pull/*:refs/remotes/pr/upstream/*
This will turn their pull/5737/head into your pr/upstream/5737/head remote-tracking branch. You can now choose whether to git checkout 5737/head to create your own local branch named 5737/head that has the remote-tracking branch pr/upstream/5737/head (this is for a "remote" named pr/upstream, which you don't have, but that's fine; though you will have to be sure not to name a new remote "pr") as its upstream. (That is, 5737/head@{u} will name pr/upstream/5737/head.)
The obvious drawback is that the name is a bit clunky. A less obvious one is that if you collect pull requests from multiple remotes, 5737/head might match both pr/upstream/5737/head and pr/another/5737/head remote-tracking branches, if both remotes have a pull request #5737 outstanding. In this case the git checkout DWIM feature, that knows how to create local branches based on remote-tracking branches, will fail: it won't pick one arbitrarily for you.
There's also no clear advantage to this. You get your own branch, so you can make your own commits, but why would you want to? The drawback to force-fetching into your own branch space, using the earlier scheme without remote-tracking branches, is that you might clobber your own commits if you forget that pr/5737 is set up this way and make some commit there that you wanted to keep. (But even then, your pr/5737 reflog will preserve your commit for 30 days by default.)
Hence, I am not sure why you might want this, but it's an option. The fetch = ... mechanism is just that: a mechanism, not a policy. It's up to you how to use it.
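As an illustration of the reflog safety net mentioned above, here is a hedged sketch in which a forced fetch clobbers a commit made on pr/5737, and the branch's reflog entry pr/5737@{1} (the value the branch held just before its most recent update) is used to recover it. The local upstream directory, the hand-made pull reference, and the branch name rescued are all assumptions for the demonstration.

```shell
#!/bin/sh
# Sketch: recover a commit that a forced fetch removed from pr/5737.
set -e
tmp=$(mktemp -d) && cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q upstream
(
    cd upstream
    git symbolic-ref HEAD refs/heads/master
    git commit -q --allow-empty -m "base"
    git checkout -q --detach
    git commit -q --allow-empty -m "PR v1"
    git update-ref refs/pull/5737/head HEAD      # fake pull-request ref
)

git init -q work && cd work
git remote add upstream "$tmp/upstream"
git fetch -q upstream +pull/5737/head:pr/5737

# Forget that pr/5737 is force-updated, and commit on it anyway.
git checkout -q pr/5737
git commit -q --allow-empty -m "my local work"
git checkout -q --detach                         # step off the branch so fetch may update it

# The pull request is replaced upstream, and the next forced fetch clobbers the branch.
(
    cd "$tmp/upstream"
    git checkout -q --detach master
    git commit -q --allow-empty -m "PR v2"
    git update-ref refs/pull/5737/head HEAD
)
git fetch -q upstream +pull/5737/head:pr/5737

# The previous value of the branch is still recorded in its reflog.
git branch rescued 'pr/5737@{1}'
git log -1 --format='rescued: %s' rescued
```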