TL;DR
For an immediate fix, I recommend using this sequence of commands:
git fetch origin
git checkout master
git merge --ff-only origin/master # make sure we are up to date
git fetch <fork_repo_url> <fork_branch>
git merge --no-ff FETCH_HEAD -m "<insert a GOOD merge message here>"
git push origin master
The good merge message is up to you to write. The standard low-quality merge message that Git provides is "Merge branch '<fork_branch>' of <fork_repo_url>". This describes the action taken, without giving a reason why the action was taken. It's a poor message because the merge commit itself already records the action taken, though not the specific URL and branch name. The specific URL and branch name are typically not useful, so in effect this merge message says "this commit is a merge commit", which is wholly redundant and therefore useless.
Ideally, you should add a remote to name the fork, after which you can shorten this slightly:
git fetch other-remote
git merge --no-ff other-remote/<fork-branch> -m "..."
(and after which, omitting the -m argument inserts Git's standard low-quality merge message, which is convenient if you're automating this and can't provide a better one.)
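If you'd like to try the sequence without touching a real project, here is a minimal sketch that runs it end to end in a throwaway sandbox. The directory names (seed, origin.git, fork, mine), the branch name feature, and the merge message are all placeholders I made up; local directories stand in for the real URLs.

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

sandbox=$(mktemp -d)
cd "$sandbox"

# A seed repository with one commit, then a bare copy that plays "origin".
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master   # pin the branch name
git -C seed commit -q --allow-empty -m "initial commit"
git clone -q --bare seed origin.git

# A stand-in for the fork, with one extra commit on a branch named "feature".
git clone -q origin.git fork
git -C fork checkout -q -b feature
git -C fork commit -q --allow-empty -m "work done in the fork"

# Our own clone, where the recommended sequence runs.
git clone -q origin.git mine
cd mine
git fetch -q origin
git checkout -q master
git merge -q --ff-only origin/master               # make sure we are up to date
git fetch -q "$sandbox/fork" feature               # stands in for <fork_repo_url> <fork_branch>
git merge -q --no-ff FETCH_HEAD -m "Merge the fork's feature work (say why here)"
git push -q origin master

# Inspect the result: origin's master should be our new merge commit.
origin_tip=$(git -C "$sandbox/origin.git" rev-parse master)
local_tip=$(git rev-parse master)
parent_count=$(( $(git rev-list --parents -n 1 master | wc -w) - 1 ))
echo "origin master: $origin_tip"
echo "local master:  $local_tip"
echo "parents of the new tip: $parent_count"
```

Because --no-ff forces a true merge, the tip of master ends up with two parents even though a fast-forward would have been possible here.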
Long
Your immediate problem has nothing to do with making the merge commit (although there may be a problem here as well). The issue with this series of commands:
git fetch <Fork_Repo_URL> <Fork_Repo_Branch>
git checkout -b <Branch_Name> FETCH_HEAD
git fetch origin
git checkout origin/master
git merge --no-ff <Remote_Name>-<Branch_Name>
git push origin master
lies in the middle of the second group of commands, or with the final command, depending on how you wish to treat this. There's also a likely error with the git merge
command, but we'll get to that later.
The command:
git checkout origin/master
produces what Git calls a detached HEAD. There is nothing fundamentally wrong with this state; you just need to understand what it does if you plan to use it.
The final command:
git push origin master
has your Git call up the Git at origin
and request that they set their name master
to the commit hash ID your Git currently has stored under your name master
(after handing them any new commits required to achieve this). If your master
and their master
both hold the same commit hash, this step achieves nothing. Your intermediate command has made a new merge commit, but has not put it on any branch, because you are operating with a detached HEAD.
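Here is a small sandbox sketch of that failure mode. The repository names are hypothetical, and I use empty commits just to keep it short. It shows that a commit made on a detached HEAD does not move master, so git push origin master has nothing to send.

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "initial commit"
git clone -q --bare seed origin.git
git clone -q origin.git mine
cd mine

git checkout -q origin/master                      # this detaches HEAD
git commit -q --allow-empty -m "made while detached"

# Is HEAD attached to a branch name? symbolic-ref fails when it is not.
if git symbolic-ref -q HEAD >/dev/null; then attached=yes; else attached=no; fi

head_commit=$(git rev-parse HEAD)
local_master=$(git rev-parse master)

git push -q origin master                          # pushes our (unchanged) master
origin_master=$(git -C "$sandbox/origin.git" rev-parse master)

echo "HEAD attached:  $attached"
echo "HEAD:           $head_commit"
echo "our master:     $local_master"
echo "origin master:  $origin_master"
```

The new commit exists, but only HEAD remembers it; both master names still hold the old hash ID.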
You have multiple options for fixing this. For instance, you could replace the second group of three Git commands with:
git fetch origin
git checkout master
git merge --ff-only origin/master # make sure our master matches theirs
# do whatever is required here if this step fails
git merge --no-ff <Branch_Name>
(Note that I have changed the argument to git merge --no-ff
here to use the name you created in the first pair of commands. We'll see a better way to deal with that in a moment.)
This way, the result of the middle group of commands is to create a new merge commit—the --no-ff
enforces a true merge even if a fast-forward is possible—with your (local) branch name master
being updated to contain the new commit. That is, the new merge we just made is now on branch master
, as git status
will tell you. This feeds into the final command—git push origin master
—so that the git push
transfers to origin
the new merge commit you just made, then requests that they set their master
to point to this new merge commit.
Alternatively, we can work with the detached HEAD by changing the final git push
command to read:
git push origin HEAD:refs/heads/master
This request uses a longer / more-detailed refspec, HEAD:refs/heads/master
, in place of the simple refspec master
. These refspec arguments are the arguments we pass to git push
and git fetch
after the name-or-URL of the remote. That is, the general form of the command is git push remote refspec1 refspec2 ... refspecN
, with every argument after the remote
being a refspec
.
A refspec is, in this second-simplest form, just two identifiers separated by a colon :
. The identifier HEAD
means that the source of the push—the commit(s) to be pushed from our (local) Git to the remote Git—should be read from HEAD
. The second identifier, refs/heads/master
, is the full name of the master
branch in their Git: this is the name we will have our Git ask their Git to update.
The reason for spelling the whole thing out, refs/heads/master
, is that when we use the simpler master
refspec—the one without a colon—Git intuits from this that we mean our master
branch and therefore we must also mean their master
branch. But we're anticipating, in this case, that we are on no branch: that we have a detached HEAD. Our Git will not be able to infer that we must therefore mean their master
branch too. Perhaps we mean their tag master
.1 By spelling out refs/heads/master
we remove all ambiguity: we definitely mean their master
branch.
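As a sketch of the detached-HEAD variant, again in a throwaway sandbox with made-up repository names, the following pushes a commit that is on no local branch straight to the other Git's master by using the full refspec:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m "initial commit"
git clone -q --bare seed origin.git
git clone -q origin.git mine
cd mine

git checkout -q origin/master                      # detached HEAD
git commit -q --allow-empty -m "made while detached"

# Source is HEAD; destination is spelled out as their branch named master.
git push -q origin HEAD:refs/heads/master

head_commit=$(git rev-parse HEAD)
origin_master=$(git -C "$sandbox/origin.git" rev-parse refs/heads/master)
local_master=$(git rev-parse master)
echo "HEAD:          $head_commit"
echo "origin master: $origin_master"
echo "our master:    $local_master"
```

Note that our own master is still behind afterwards: the push updated their branch, not ours.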
Let's also take a look into this peculiar bit of code you've shown:
git fetch <Fork_Repo_URL> <Fork_Repo_Branch>
git checkout -b <Branch_Name> FETCH_HEAD
You create a (local) branch named Branch_Name
, but then never use it. Why?
1It's a very bad idea to create a tag named master
, and if no one ever does this, our careful use of refs/heads/master
here is not actually required. So we're really just being careful to avoid tripping over someone else's error here.
Git is about commits, and not so much about branches
First, Git is really all about commits. Git is not about files—though files do get stored in commits, as snapshots—and mainly provides branch names for us dumb humans, who cannot deal with big ugly commit hash IDs. The true name of any one commit is its big ugly hash ID, such as 08da6496b61341ec45eac36afcc8f94242763468
. Without this hash ID, Git can do nothing at all. But humans just can't deal with hash IDs (quick, was that 08da64something, or 08da46something? did you have to look back at it to find out? can you remember the whole thing from one minute to the next?). Humans can, however, remember names like master
. So Git will let us use names to substitute for these hash IDs.
Every hash ID is guaranteed to be unique. However, every Git in the universe is required to agree that two identical commits have the same hash ID, so that you can pass commits from one Git to another through git clone
, git fetch
, and git push
. If you can find two different commits that have the same hash ID, you've essentially found a way to "break" Git. For much more detail about this, see How does the newly found SHA-1 collision affect Git?
Each commit, as we just said, stores a snapshot. That is, it has a full and complete copy of every file.2 That's the data for the commit—but it also has important metadata, such as who made the commit, when, and why: the log message they supplied when they made the commit, to tell us what the purpose of this new commit was. One of the most important pieces of metadata here is that each commit stores the raw hash ID of its immediate predecessor or parent commit(s). Most commits have exactly one parent hash ID. Merge commits are defined as any commit that stores two or more parent hash IDs (and most of these store exactly two).
These parent hash IDs are how Git finds a previous commit. Since every commit stores its immediate parent, we can always start at the last commit and work backwards. If we let single uppercase letters stand in for real commit hash IDs, the commit whose hash ID is H
stores its parent's hash ID, which we can call G
. We say that commit H
points to commit G
:
G <-H
Meanwhile, commit G
has a parent hash ID, so it points to its own parent F
:
<-F <-G <-H
Like G
, F
points to its parent. If the whole repository has exactly eight commits, they might look like this:
A <-B <-C <-D <-E <-F <-G <-H
Commit A
is the very first commit we ever made, so it can't point to a parent. It doesn't have one, so it just doesn't list any parent at all. This rather special kind of commit is a root commit and every non-empty repository has at least one.3
Much of Git's job is simply following all of these pointers when necessary. These commits, with their internal backwards-pointing arrows, are the history in the repository. Each commit has its own individual snapshot of all of your files. Git produces diff listings by comparing any two commits. The changes in a commit are whatever is different from parent to child, and the parent commit is recorded by hash ID in the child. It's really that simple.
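You can watch these backwards-pointing links in a scratch repository. This is just an illustrative sketch: the three commits are empty ones with messages A, B, and C, standing in for the lettered commits in the diagrams.

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
for msg in A B C; do
    git commit -q --allow-empty -m "$msg"
done

# The raw commit object: a tree (the snapshot), a parent line, and metadata.
git cat-file -p HEAD

# Walk backwards from the last commit; each line is "commit parent(s)".
git rev-list --parents HEAD

count=$(git rev-list --count HEAD)
root=$(git rev-list --max-parents=0 HEAD)
root_parents=$(git log -1 --format=%P "$root")     # empty for a root commit
echo "commits reachable from HEAD: $count"
echo "root commit: $root (parents: '${root_parents}')"
```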
Note, though, that real hash IDs are random-looking as well as big and ugly. We can find H
from the diagram above because H
is the last letter, but that doesn't work with real hash IDs. So Git has to have a way to store the hash ID of the last commit in some branch, as well as to give us mere humans a way to identify that commit. This is where branch names come in.
2That is, it has a full and complete copy of every file that it has, but that's kind of redundant. The idea here is that if you make some commit that has, say, a README
file, then make a second commit in which you have not changed the README
file, this second commit has a copy of the README
file. Git can and does actually share the two identical files, which it stores as blob objects, which have hash IDs just like commits. Hence, no matter how many times you commit one particular version of a file, the repository stores only one copy of it. Because all internal Git objects are completely and totally read-only, it's easy to share them. They cannot be changed, so re-using some existing object is just a matter of re-using its hash ID.
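A quick way to see this sharing, sketched in a scratch repository (the file names and contents are made up): commit a README, then make a second commit that leaves it alone, and compare the blob hash IDs the two commits use for it.

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

echo "hello" > README
git add README
git commit -q -m "add README"

echo "unrelated" > notes.txt
git add notes.txt
git commit -q -m "add notes, README untouched"

# <commit>:<path> names the blob object stored for that path in that commit.
first_blob=$(git rev-parse HEAD~1:README)
second_blob=$(git rev-parse HEAD:README)
echo "README blob in first commit:  $first_blob"
echo "README blob in second commit: $second_blob"
```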
3Most repositories probably have only one root commit, though you can make additional root commits once you know how. You can also acquire new root commits by connecting your Git to some unrelated Git: one that you never cloned with git clone, either directly or by way of some other clone. That other Git has its own root commit, with its own unique hash ID. Once you obtain their commits, they are copied into your repository, and their root commit becomes another root commit in your own Git.
Branch names are pointers to commits
A branch name simply stores the hash ID of the last commit in the branch. To see how this works, imagine our eight-commit repository has one branch name, master
. The name master
will point to commit H
by storing the raw hash ID of commit H
, and we can draw it like this:
A--...--G--H <-- master
If we now make a new commit, Git will set the new commit's parent hash ID to H
, write out the new commit to discover its hash ID, and see that the new commit's hash ID is I
(well, really, some big ugly hash ID):
...--G--H   <-- master
         \
          I
Now Git simply writes I
's hash ID—whatever it really is—into the name master
, giving us:
...--G--H--I <-- master
The name now points to the last commit, as it always does. From the last commit, we (or Git) will work backwards, following the backwards-pointing arrows embedded inside each commit. (I drew them as lines here just because it's easier on Stack Overflow, especially when I had to draw commit I
on a diagonal below H
. If you can draw good backwards-pointing arrows, that's not a bad idea, whenever you draw commits. It's a reminder that Git always works backwards.)
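As a minimal sketch in a scratch repository (empty commits labelled H and I for brevity), you can watch the name master move to the new commit, with the new commit pointing back at the old tip:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

git commit -q --allow-empty -m H
before=$(git rev-parse master)          # hash ID stored in the name master

git commit -q --allow-empty -m I
after=$(git rev-parse master)           # the name now holds the new hash ID
parent_of_after=$(git rev-parse master^)

echo "master before: $before"
echo "master after:  $after"
echo "parent of new tip: $parent_of_after"
```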
HEAD is something we usually attach to a branch name
Every time we make a new commit, Git just writes the hash ID of the new commit into a branch name. But—which one? To make more-than-one branch name work, in a Git repository, Git needs to remember which branch we're on.
The way Git does this is by "attaching HEAD". The special name HEAD
, written in all-capital letters like this, is normally attached to one branch name. Let's go back to our eight-commit state:
...--G--H <-- master (HEAD)
Now let's create a new name, dev
, that also points to commit H
. We'll leave HEAD
attached to master
, because we will use git branch dev
to create dev
, so that we have:
...--G--H <-- master (HEAD), dev
Now let's create a new commit I
, and then another one J
. Since HEAD
is attached to master
, that's the name that Git will update:
          I--J   <-- master (HEAD)
         /
...--G--H   <-- dev
Now let's git checkout dev
, which will refill our work areas from commit H
instead of commit J
and will attach HEAD
to the name dev
:
          I--J   <-- master
         /
...--G--H   <-- dev (HEAD)
Now we can make another two commits:
          I--J   <-- master
         /
...--G--H
         \
          K--L   <-- dev (HEAD)
This action, of creating new commits with the current branch name advancing as we go, is how we build up these branches. The names just hold the hash ID of each branch's tip-most commit. Tip is Git's technical term for this.
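The walk-through above can be reproduced in a scratch repository. This sketch uses empty commits whose messages match the letters in the diagrams, and asks Git which branch name HEAD is attached to at each stage:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

git commit -q --allow-empty -m H
git branch dev                               # new name, HEAD stays on master
attached_first=$(git symbolic-ref HEAD)

git commit -q --allow-empty -m I
git commit -q --allow-empty -m J             # master advances, dev does not

git checkout -q dev                          # re-attach HEAD to dev
attached_second=$(git symbolic-ref HEAD)
git commit -q --allow-empty -m K
git commit -q --allow-empty -m L             # now dev advances

echo "HEAD was attached to: $attached_first, then: $attached_second"
git log --graph --oneline --all              # draws the two diverging branches
```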
If we now git checkout master
to select commit J
and name master
, and then run git merge dev
, Git will build a true merge commit. The mechanism for making this merge involves finding the merge base commit—the point where the two branches were last "seen together", as it were—which in this case is obviously commit H
. We won't go into detail here, but the result is this:
          I--J
         /    \
...--G--H      M   <-- master (HEAD)
         \    /
          K--L   <-- dev
The name master
now points to merge commit M
. M
has a snapshot as usual, but unusually, has two parents, J
and L
. When Git works backwards from M
, it goes to both commits (though one at a time). The history starting from M
and working backwards includes I-J
and K-L
, and working backwards from there, H
, G
, and so on.
Note that nothing else is different! The branch names still each point to just one commit. Each commit still has a snapshot. Each commit still has some set of parents. The only thing special about a merge commit is that it has more than one parent, and therefore, the history diverges at the merge (and in this case re-converges at the fork at H
).
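To check the shape of M for yourself, here is a sketch that rebuilds the same graph from empty commits in a scratch repository, records the merge base before merging, and then reads back the two parents of the merge commit:

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m H
git branch dev
git commit -q --allow-empty -m I
git commit -q --allow-empty -m J
git checkout -q dev
git commit -q --allow-empty -m K
git commit -q --allow-empty -m L
git checkout -q master

commit_h=$(git rev-parse dev~2)
tip_j=$(git rev-parse master)
tip_l=$(git rev-parse dev)
base=$(git merge-base master dev)            # where the branches last met

git merge -q dev -m "M: merge dev into master"
first_parent=$(git rev-parse master^1)
second_parent=$(git rev-parse master^2)

echo "merge base:    $base"
echo "first parent:  $first_parent"
echo "second parent: $second_parent"
```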
Detaching HEAD
Once you understand the above, a detached HEAD is really pretty simple. Git just stores a raw hash ID directly in the name HEAD
, rather than attaching HEAD
to a branch name and getting the hash ID from the branch name. So we can have:
...--E--F   <-- name
         \
          G   <-- HEAD
If we make another new commit in this state, only the name HEAD
remembers it:
...--E--F   <-- name
         \
          G--H   <-- HEAD
If we now git checkout name
to get back on a branch, the new commits' hash IDs become hard to find:
...--E--F   <-- name (HEAD)
         \
          G--H   ???
What are the actual hash IDs of G
and H
? I have no idea, and neither do you. Can you find them? If so, maybe you can get those commits back, by using their raw hash IDs, or creating a new name pointing to them. If not, maybe not. For instance, if commit H
's hash ID is still visible in a window, you can cut-and-paste it into a git branch
command:
git branch recover-it <some big ugly hash ID>
which gives us:
...--E--F   <-- name (HEAD)
         \
          G--H   <-- recover-it
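Here is a sketch of that recovery in a scratch repository. I use master where the diagrams say name, and I save the hash ID in a shell variable to play the part of "still visible in a window".

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m E
git commit -q --allow-empty -m F

git checkout -q --detach                     # HEAD now holds F's raw hash ID
git commit -q --allow-empty -m G
git commit -q --allow-empty -m H
lost=$(git rev-parse HEAD)                   # write down H's hash ID while we can

git checkout -q master                       # back on a branch; G and H are nameless
git branch recover-it "$lost"                # give them a name again

recovered=$(git rev-parse recover-it)
extra=$(git rev-list --count master..recover-it)
echo "recover-it points to: $recovered"
echo "commits on recover-it that master lacks: $extra"
```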
Branch names move, and can be created and destroyed as you like
So the summary of the above is:
- Each commit has its own unique hash ID.
- The last (or tip) commit of a branch is the one stored in some name.
- You can make new names at any time. The only constraint here is that you have to have the commit already—you must have some existing commit, whose hash ID you can find, or name by some other name.
- You can delete names at any time, though if this might lose your only way to find some commit hash IDs, be careful! (Git will try to be careful for you, requiring the --force flag or git branch -D, in various cases. This code is a little tricky and has evolved over time; different versions of Git have different rules about what's easy or hard to delete.)
- Branch names move automatically when you create commits.
- The special name HEAD is usually attached to a branch name, but in detached HEAD mode, isn't.
There's one other key thing to note, though: Branch names are local to one particular Git repository.
Remote-tracking names
Again, branch names are local to one particular Git repository. If I've made a clone of your Git repository, I may have a master
and/or a dev
. These are my master
and my dev
; they are not your master
, nor your dev
. I can make new commits and change the hash IDs stored in my master
and my dev
, however I like.
But since I got all of my initial commits from your repository, all of my initial commit hash IDs match all of your Git's hash IDs. I might like to remember that your master
points to, say, a123456...
. That way I can see that since I cloned your repository, I've created two new master
commits.
The way my Git remembers your Git's branch names is that my Git has remote-tracking names.4 My Git has an origin/master
to remember for me where your Git had your master
, the last time I talked to your Git:
          I--J   <-- master (HEAD)
         /
...--G--H   <-- origin/master
Here I have made two new commits, that you don't have.
Of course, branch names move. Your Git may have new commits. They might be on your master
. So I can connect my Git to your Git by using git fetch
:
git fetch origin
It turns out that you, too, made two new commits. They have their own big ugly hash IDs but I'll just call them K
and L
. My Git obtains these two commits, and sees that your master
points to L, so my Git now has this:
          I--J   <-- master (HEAD)
         /
...--G--H
         \
          K--L   <-- origin/master
Let's say I now run:
git merge origin/master
This creates a new merge commit M
, using H
as the merge base and J
and L
as the two branch tip commits:
          I--J
         /    \
...--G--H      M   <-- master (HEAD)
         \    /
          K--L   <-- origin/master
Haven't we seen this before?
Scan up to the example where I merged dev
into master
. Yes, we've seen this exact scenario earlier. The only real difference was that I was using my own dev
; this time I'm using my remote-tracking name origin/master
instead of my branch name dev
.
So a remote-tracking name like origin/master
is just a name holding one commit hash ID. In that sense, it's just like a branch name. But there's already one big key difference. If I run git checkout origin/master
, I get a detached HEAD:
          I--J   <-- master
         /
...--G--H
         \
          K--L   <-- HEAD, origin/master
If I now run git merge master
I'll get a commit M
again, but like this:
          I--J   <-- master
         /    \
...--G--H      M   <-- HEAD
         \    /
          K--L   <-- origin/master
There's a second important difference to this M
, but I'll just footnote it.5
The last important thing to remember about remote-tracking names is that git fetch origin
updates all of your origin/*
names by default. That is, your Git calls up their Git. Their Git lists out all their branch names. Your Git ends the fetch
operation by updating your own memory of all of their branch names.
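A sandbox sketch of this, with two hypothetical clones named mine and theirs sharing one bare origin.git: my origin/master stays where it was until I run git fetch origin, and my own master does not move at all.

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m H
git clone -q --bare seed origin.git
git clone -q origin.git mine
git clone -q origin.git theirs

# Someone else adds two commits and publishes them.
git -C theirs commit -q --allow-empty -m K
git -C theirs commit -q --allow-empty -m L
git -C theirs push -q origin master

cd mine
stale=$(git rev-parse origin/master)         # my memory of their master, out of date
git fetch -q origin
fresh=$(git rev-parse origin/master)         # refreshed by the fetch
local_master=$(git rev-parse master)         # my own branch name is untouched
their_master=$(git -C "$sandbox/origin.git" rev-parse master)

echo "origin/master before fetch: $stale"
echo "origin/master after fetch:  $fresh"
echo "my master:                  $local_master"
```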
You can limit which remote-tracking names your Git updates. The git pull
command invokes git fetch
in a way that often does limit them this way. I recommend not using git pull
at all, but that is also a matter of opinion; you can use git pull
successfully, as long as you remember that it means run git fetch
with specific options, then run a second Git command with other options.
4Git calls these remote-tracking branch names. I don't like the extra word branch in this phrase, so I leave it out, but you should know that the Git documentation keeps it in there.
5This merge commit has, as its first parent, commit L
. Commit J
is its second parent. The order of the parents is sometimes unimportant—Git will follow history by going back along both paths—but sometimes important, because Git has a --first-parent
option to commands like git log
to tell it when you come to a merge, follow only the first parent. When and whether this first-parent property is useful to you depends on how you use Git, but it's worth noting that git pull
tends to make these kinds of swapped linkages: the commit that many people believe should be the first parent is the second, and vice versa. This is all a matter of opinion, but the opinion is strong enough to have Bitbucket provide an optional device to prevent what they call Foxtrot Merges.
git fetch with a URL, vs using a remote
In all of the examples so far, we've used git fetch origin
or just git fetch
to obtain new commits from the Git that our Git gets when our Git calls up someone at the URL stored under the name origin
. But you're using:
git fetch <url> <branch-name>
This form of git fetch
is much older, dating back to 2005 or earlier, before the invention of the remote names like origin
.
When using this form of git fetch
, there are no remote-tracking names. Remote-tracking names like origin/master
are built by taking the remote's name, origin
, and their branch name master
. Without a remote name—with only a URL—there is no place to build a remote-tracking name.
In the bad old days before remote names, we had to do things this way. By running:
git fetch <url> <branch-name>
we limited our git fetch
to calling up the Git at the URL and asking only about that one branch. Their Git would say, e.g., OK, my develop
is a123456...
and hand over commit a123456...
and any earlier commits we needed to give our Git the ability to walk backwards from a123456...
to a root commit.
So, we now would have:
...--G--H   <-- master
         \
          K--L   [a123456...]
We had to stash the actual hash ID of commit L
, whatever it might be, somewhere. The somewhere is the special name FETCH_HEAD
.
Note that FETCH_HEAD
is overwritten by each git fetch
command. The name FETCH_HEAD
suffices to find commit L
right now, after the one git fetch
we just ran, but it won't last. The next git fetch
will overwrite FETCH_HEAD
and if we don't save L
's hash ID somewhere quickly, we'll lose it and never be able to find it in the haystack of random-looking hash IDs inside a Git repository.
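To see FETCH_HEAD being overwritten, here is a sandbox sketch. The names fork, develop, and saved-develop are made up for the example, and a local directory stands in for the fork's URL.

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m H
git clone -q --bare seed origin.git
git clone -q origin.git fork
git -C fork checkout -q -b develop
git -C fork commit -q --allow-empty -m K
git -C fork commit -q --allow-empty -m L
git clone -q origin.git mine
cd mine

git fetch -q "$sandbox/fork" develop         # URL-style fetch of one branch
first=$(git rev-parse FETCH_HEAD)            # commit L, for now
git branch saved-develop FETCH_HEAD          # keep a lasting name for it

# No remote-tracking names were created for the fork; only origin's exist.
git for-each-ref --format='%(refname)' refs/remotes

git fetch -q origin                          # any later fetch rewrites FETCH_HEAD
second=$(git rev-parse FETCH_HEAD)

echo "FETCH_HEAD after fetching the fork: $first"
echo "FETCH_HEAD after fetching origin:   $second"
echo "saved-develop still points to:      $(git rev-parse saved-develop)"
```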
So that's why you have this second command:
git checkout -b <branch> FETCH_HEAD
The name FETCH_HEAD
serves the same function as a remote-tracking name, but only until the next git fetch
. We use the poor substitute for a remote-tracking name to create a local branch name pointing to that very same commit:
...--G--H   <-- master
         \
          K--L   <-- branch
That's our branch name, which we are now free to do with as we will.
This kind of clutters up our repository. It's a much better idea to create a remote for the fork:
git remote add xyzzy <fork_repo_url>
(pick a better name than xyzzy
, of course). Now we can run:
git fetch xyzzy
and acquire or update our remote-tracking names xyzzy/master
, xyzzy/develop
, xyzzy/feature/tall
, and so on. We don't have to worry about dealing with the FETCH_HEAD
file that gets overwritten on each new fetch—including a fetch to origin
—that could mess with our memory of hash IDs from the fork-repo.
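Putting that together, here is a sandbox sketch of the remote-based workflow. The remote name fork, the branch name develop, and the merge message are placeholders, and a local directory again stands in for <fork_repo_url>.

```shell
set -eu
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
git -C seed commit -q --allow-empty -m H
git clone -q --bare seed origin.git
git clone -q origin.git forkrepo
git -C forkrepo checkout -q -b develop
git -C forkrepo commit -q --allow-empty -m K
git clone -q origin.git mine
cd mine

git remote add fork "$sandbox/forkrepo"      # name the fork once
git fetch -q fork                            # creates fork/master, fork/develop, ...
tracked=$(git rev-parse fork/develop)

git fetch -q origin                          # does not disturb fork/develop
still_tracked=$(git rev-parse fork/develop)

git checkout -q master
git merge -q --no-ff fork/develop -m "Merge the fork's develop work (say why here)"
merged_in=$(git rev-parse master^2)

echo "fork/develop:            $tracked"
echo "second parent of master: $merged_in"
```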