Your question comes in two parts:
1. How do I create a branch on the server and then push my non-branched local commit to the new branch assuming that the master head has not moved since I cloned it?
2. If the master head did move since cloning, how would I branch off the previous master commit on the server and then push my changes there?
but in fact, it reveals the usual confusion everyone has with Git, which is the difference between branch and branch—or maybe, branch, branch, and branch. Git uses one word to mean at least two different things. (After which, the distributed nature of Git complicates the complications.)
What is a branch anyway?
Let's take on question 1 first and identify the different things we might mean by "branch". We can use the phrase branch name for one of them. A name like master is a branch name, and a branch name points to one specific commit.
We have not really defined commit yet either, but you have a pretty good idea what commits are about, by this point. Just in case, though, let's firm it up. A commit is a Git object that:
- represents a snapshot of the source;
- carries an author name-email-timestamp, to remember who wrote the code that went into it; and the same for a possible second person, the committer, though these days the two are most often the same;
- carries a log message; and
- identifies some parent commit(s).
This last bit is critical and we'll come back to it in the next paragraph. For now, let's note that each commit has its own unique hash ID, 8e2df81... or ac0ffee... or badbead... or some such. A branch name like master is a human readable name for one (and only one, or rather, one at a time) of these hash IDs. We call this the branch tip, or the tip commit of the branch, but this last phrase is a bit unsatisfying: if master is the branch, how can master be the name of the tip of the branch? What is the actual branch, then?
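To make the pieces of a commit visible, here is a minimal sketch in a throwaway repository. The temporary directory, the "Demo" identity, and the empty commits standing in for A and B are assumptions made for illustration only.

```shell
# Throwaway repository; the directory and the "Demo" identity are illustrative only.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the initial branch master
git commit -q --allow-empty -m "A"        # first commit: it has no parent
git commit -q --allow-empty -m "B"        # records A's hash ID as its parent

tip=$(git rev-parse master)       # the one hash ID the name master holds right now
parent=$(git rev-parse master^)   # the hash ID of the tip's parent

# Print the raw commit object: tree (snapshot), parent, author, committer, message.
git cat-file -p "$tip"
```

The `parent` line in that output carries the same hash ID that `git rev-parse master^` reports, which is the backward link the rest of this answer relies on.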
Let's imagine a repository with just a few commits in it. Let's start with just three, and instead of big ugly hash IDs, let's use single-letter names for writing them down. The first commit we ever make is commit A. Because it is the first commit, it cannot have a parent commit. There's a special name for this parentless commit: it's called a root commit.
Now we make a second commit and call it B, storing A's actual hash ID in it as its parent. We say that B points to A. We make C with parent B and say that C points to B. To keep track of it all, we make the name master point to C. Let's draw this:
A <- B <- C <-- master
All of Git's internal arrows are backwards like this (the name master finds the last commit, which finds an earlier commit, and so on), but it's too much of a pain to draw this way (at least in text on StackOverflow), so let's just use connecting lines from now on:
A--B--C <-- master
To add a new commit, we compute its hash ID (really some big ugly hash, but we'll just call it D) from its contents: the snapshot, the author and committer, the log message, and its parent ID C. We write this commit into the repository database:
A--B--C <-- master
       \
        D
and then we move the name master to point to the new commit D:
A--B--C [C used to be master]
       \
        D <-- master
This is how branches grow, by adding new commits—this never touches any existing commit at all—and then changing a branch name to point to the new branch tip. So the other necessary meaning for branch, whatever other name we might give it—I've suggested the name "DAGlet"; see What exactly do we mean by "branch"?—is some subset of the commits found by starting at the branch tip and working backwards towards the root.
This is where the initial confusing bits come in, because Git keeps just using one word for all of these things: a DAGlet like A-B-C-D
is "a branch", the tip commit D
of branch master
is "a branch", and master
itself is "a branch". But they are all different things. Git just expects us to know which thing we mean—and after a bit of practice, you really do just know, at least most of the time, but it's nice to have the words branch name, branch tip, and DAGlet when we need to separate them.
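The three senses can be told apart on the command line. The following is a sketch under assumptions: a scratch repository, a made-up "Demo" identity, and empty commits standing in for A through D.

```shell
# Scratch repository with commits A, B, C (all names here are illustrative).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
for msg in A B C; do git commit -q --allow-empty -m "$msg"; done

old_tip=$(git rev-parse master)      # branch tip before: commit C
git commit -q --allow-empty -m "D"   # write D, then Git moves master to it
new_tip=$(git rev-parse master)      # branch tip after: commit D

git rev-parse --abbrev-ref HEAD      # the branch name we are on
git rev-list --count master          # how many commits the DAGlet contains
git log --format=%s master           # walk from the tip back to the root
```

Adding D changed what the name master resolves to, but the hash ID saved in `old_tip` still names the same, untouched commit C.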
Creating new branch names and dealing with different repositories
So, now we can look closer at the first half of the first question: you need to create a branch name on the server. We can also look at the second question at the same time: you may want to create a branch name in your own repository.
Git, being Git, offers two standard user-interface ways to create branch names in your own repository: git branch and git checkout -b. We already saw that a branch name just points to one specific commit. You made a new commit D and Git made your own master point to this new D:
A--B--C
       \
        D <-- master
I drew D on its own line originally because I wanted to leave room for the note I had on C. Let's leave that space around to add more names pointing to C, and just for the heck of it, let's make a new name, old-master, pointing to C now. The way we do that is with git branch:
git branch old-master <hash-ID-of-commit-C>
which results in this:
A--B--C <-- old-master
       \
        D <-- master
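Here is a small runnable version of that step. It is only a sketch: the scratch repository and the "Demo" identity are invented, and `master~1` is used to look up the hash ID of commit C instead of typing it by hand.

```shell
# Scratch repository with commits A, B, C, D on master (illustrative names).
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
for msg in A B C D; do git commit -q --allow-empty -m "$msg"; done

c_hash=$(git rev-parse master~1)   # commit C, one step back from the tip D
git branch old-master "$c_hash"    # new name pointing at C; no commit changes

# Show each commit subject followed by the names that point at it.
git log --format='%s %D' master
```

Creating the name does not switch branches, so master stays the current branch.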
Remember, though, this is all happening in your repository, with your branch names. There's another Git out there, with another repository, on the server. That Git still only has A-B-C
, and his master
still points to C
. Here's where Git gets clever: your Git remembers, for you, where his Git had his names, using names exclusive to your repository. These names are your remote-tracking branch names. (Oh look, another kind of branch!)
Remote-tracking branch names are prefixed with the name of the remote, which is usually origin
. So origin/master
is how your Git remembers, for you, that their Git has their master
pointing to commit C
. Let's draw that in:
A--B--C <-- old-master, origin/master
       \
        D <-- master
Note that we don't tell our Git to make the origin/master
name: it does that on its own, by finding their master
and adding origin/
in front for us. When we connect our Git to their Git, our Git can get any updates from them and move the origin/*
names around as needed. This is what remote-tracking branches are all about: they remember some other Git's name-to-hash-ID mappings for you.
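To watch a remote-tracking name appear, one can stand up a pretend server locally. Everything in this sketch is an assumption for illustration: the directory names seed, server.git, and work, the "Demo" identity, and a bare repository on disk playing the role of the server.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
base=$(mktemp -d) && cd "$base"

# Stand-in "server": a bare repository seeded with commits A, B, C.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
for msg in A B C; do git -C seed commit -q --allow-empty -m "$msg"; done
git clone -q --bare seed server.git

# Our clone: Git creates origin/master by itself to remember the server's master.
git clone -q server.git work
cd work
git commit -q --allow-empty -m "D"   # our master moves; origin/master does not

git branch -r                        # list the remote-tracking names
ours=$(git rev-parse master)
theirs=$(git rev-parse origin/master)
```

After the new commit, `ours` and `theirs` differ, and `theirs` still matches what the server's own master holds.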
Note also that all of this faffing-about with names has, up until now, the simple goal of identifying one specific commit. If we want a branch tip, once we find the commit, we're done. If we want a DAGlet—some set of commits found by working backwards through parent IDs—we're pretty good here too, because the tip commit has the parent ID, and that parent has another parent ID, and so on. (We might need to tell Git when to stop going backwards, but I'll leave that for other SO postings.)
But, now we need to tell our Git to ask that other Git to make a new branch name. With our Git we used git branch
or (not yet shown here) git checkout -b
, but that doesn't tell their Git anything. Aside from actually logging in on the server—which would work!—we need a way to have our Git call up their Git and ask it—the server—to do this; and the way we do that is with git push
.
When we run git push
, we give it:
- the name of a remote, like origin: look up a URL and use that to call up another Git on the Internet-phone, and
- some set of refspecs.
Oversimplifying a lot, the refspecs are basically branch names, but paired with a colon between them:
git push origin master:blah
The name on the left is our name, and the name on the right, blah, is the name we would like their Git to update, or re-set, or create. Our Git calls up their Git, has a little conversation to find out what commit(s) we need to send them (ones we have that they don't, like our new commit D[1]) and asks them to set their name. If we don't use the :blah part, our Git uses our name: "Please set your master the same way I have my master set, to point to this new commit D I give you."
[1] The way Git figures this out is to do those DAGlet actions. If we'd added D and E and F, our Git would work back from F to E to D to C. On reaching C, their Git would say: "stop, I have that one!" and our Git would send over the D-E-F chain.
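The same walk can be approximated locally before pushing. This is a sketch, not what git push runs internally: it uses our remembered origin/master rather than a live conversation with the server, and the seed, server.git, and work directories plus the "Demo" identity are made up for the example.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
base=$(mktemp -d) && cd "$base"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
for msg in A B C; do git -C seed commit -q --allow-empty -m "$msg"; done
git clone -q --bare seed server.git
git clone -q server.git work
cd work

# Add D, E, F on top of the C that the server already has.
for msg in D E F; do git commit -q --allow-empty -m "$msg"; done

# Commits reachable from master but not from origin/master:
# working back from F, the walk stops on reaching C.
git log --format=%s origin/master..master
to_send=$(git rev-list --count origin/master..master)
```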
But wait, there's more! Or is it less?!
We can now completely answer question 1, which is also the complete answer to question 2! Let's say we want to give the server our new commit D
and have them set a new branch name, test
:
git push origin master:test
This sends them commit D
and then asks them to set their test
, which is new to them, to point to D
. Their master
probably still points to C
, so they get this:
A--B--C <-- master
       \
        D <-- test
Our git push
transfers the DAGlet first, then asks them to set a name, and we didn't ask them to set their master
. So suppose that their master did move, as in your question 2. They have:
A--B--C--E <-- master
to start with. We send them D, our new commit (remember, every commit has its own magically unique[2] hash ID, so our D has a number that's different from their E), and ask them to set the branch name test, which gives them:
A--B--C--E <-- master
       \
        D <-- test
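The moved-master case can be rehearsed end to end against a pretend server. This is a sketch under assumptions: a bare repository on disk stands in for the server, a second clone named other plays whoever pushed E, and all directory names and the "Demo" identity are invented.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
base=$(mktemp -d) && cd "$base"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
for msg in A B C; do git -C seed commit -q --allow-empty -m "$msg"; done
git clone -q --bare seed server.git
git clone -q server.git work      # our clone, made while master was at C
git clone -q server.git other     # someone else's clone

# Someone else moves the server's master from C to E.
git -C other commit -q --allow-empty -m "E"
git -C other push -q origin master

# We commit D on our master and push it to a new server-side name, test.
cd work
git commit -q --allow-empty -m "D"
git push -q origin master:test

# Inspect the server: master and test now name different tips.
git -C "$base/server.git" log -1 --format=%s master
git -C "$base/server.git" log -1 --format=%s test
fork=$(git -C "$base/server.git" merge-base master test)
```

The push succeeds without fetching E first, because it only asks the server to create test and never touches the server's master.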
[2] The magic lies in using a good hash function. Git currently uses SHA-1, which was good enough, but now ... well, it's still good enough, but getting a bit creaky.
But this leaves us with a bit of a problem of our own
That one git push
command does the whole job for the server, but it leaves our repository all messy. I'm going to drop the name old-master
since we have origin/master
.
A--B--C <-- origin/master
       \
        D <-- master, origin/test
We probably don't want this: we'd like to have our own test
too, to match this new origin/test
our Git just created to remember their test
. So now we can use git checkout -b
, or git branch
, or a bit of special Git magic with plain old git checkout
:
git checkout test
Note that we don't have a branch named test, or at least, not yet. This checkout command is about to fail! But just before it does, Git says to itself:[3] Wait, maybe you meant to make a new branch based on an origin branch! It checks around, and sure enough, there's an origin/test now,[4] so it makes a new test pointing to the same branch-tip commit, and as a side effect, sets the new branch's upstream to origin/test:
A--B--C <-- origin/master
       \
        D <-- master, test, origin/test
Now we just need to fix up master to point back to commit C, like it does[5] on the server. Git being Git, there are multiple ways to do that too, but the one for beginners to use is git reset --hard, which is a bit unfortunate as git reset --hard is a little bit dangerous, and requires that you first switch back to branch master:
git checkout master && git reset --hard origin/master
This first puts your Git on branch master
, as git status
will say, and then changes the current branch (the one that git status reports after "On branch") to point to the same commit as origin/master
:
A--B--C <-- master (HEAD), origin/master
       \
        D <-- test, origin/test
Note that I added this (HEAD)
to the drawing: that's to remember which branch we're on. Now that there are two, master
and test
, we always need to know which one is going to get moved by git commit
and git reset
and so on.
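Put together, the local tidy-up looks roughly like this. It is a sketch that assumes a pretend server (a bare repository on disk), invented directory names, and a "Demo" identity. Note that git reset --hard discards uncommitted work, so it is only safe to try in a scratch clone like this one.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
base=$(mktemp -d) && cd "$base"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
for msg in A B C; do git -C seed commit -q --allow-empty -m "$msg"; done
git clone -q --bare seed server.git
git clone -q server.git work
cd work
git commit -q --allow-empty -m "D"
git push -q origin master:test      # server gets D and a new name, test

git checkout -q test                # no local test yet: Git creates it from origin/test
git checkout -q master              # switch back so that reset moves master
git reset -q --hard origin/master   # point master at the commit origin/master names

git rev-parse --abbrev-ref 'test@{upstream}'   # the upstream Git recorded for test
```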
I've completely skipped git checkout -b
here, but what it does is combine git branch
with git checkout
. It creates a new branch name, like git branch
would; and it switches to that new branch, like git checkout
would. If anything goes wrong in all of this process, it magically manages to skip creating the new branch name. Internally, it does this by "cheating": it does the commit-level checkout step first. If that succeeds, now it creates the branch name and changes which branch you're on. When you do this manually you're forced to do it in the more sensible order: give the commit a name first, then check out the name-that-names-the-branch-tip.
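For comparison, here are the one-step and two-step forms side by side. This is a sketch in a scratch repository; the branch names feature and feature2 and the "Demo" identity are hypothetical.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
for msg in A B C; do git commit -q --allow-empty -m "$msg"; done

# One step: create feature at the current commit and switch to it.
git checkout -q -b feature
git checkout -q master

# Two steps: create the name first, then switch to it.
git branch feature2
git checkout -q feature2

git rev-parse --abbrev-ref HEAD   # the branch we ended up on
```

Both names end up pointing at the same commit as master, since no new commits were made in between.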
You can mix these various steps up into various other orders, and if you remember to make a new branch first, git checkout -b
is the way to go. But the key to keeping all of this straight is to remember that it's the graph—the "DAG", made by drawing commits and their parents—that's the permanent part of the equation, while the branch names like master
and test
are just labels that you can shuffle around all you like. You need the names to find the commits, because the hash IDs are too unwieldy.
[3] Don't anthropomorphize computers; they hate that.
[4] This assumes your Git is version 1.8.4 or higher.
[5] Or does it? What if there's that E commit? See the next section.
Picking up that new commit E
Going back to question 2:
If the master head did move since cloning, how would I branch off the previous master commit on the server and then push my changes there?
we already saw that you don't have to pick up the new commit E
at all, to get your own new commit D
put onto a new side-branch. But you might want to see commit E
.
You can do this at any time by running git fetch
. Git's fetch
—not git pull
; I recommend avoiding git pull
—is as close as Git gets to the opposite of push
. Like git push
, it takes the name of a remote, and some optional set of refspecs, but usually you just name the remote, or let Git figure it out for you since there's probably only one remote anyway:
git fetch
This calls up the other Git (using the URL from the remote as usual) on the Internet-phone, but this time, instead of your Git sending their Git your commits, your Git has their Git send you their commits. Just as before, your Git stops them as soon as they reach the DAGlets you already have, so this only gets new commits (well, plus any other data needed to complete those new commits). Then your Git remembers their branch names by updating all your remote-tracking branch names. So, if they do have commit E
and you run git fetch
, your graph goes from:
A--B--C <-- origin/master
       \
        D [whatever name(s) you have here at this point]
to:
A--B--C--E <-- origin/master
       \
        D [whatever]
It's always safe to run git fetch
, because git fetch
only adds new commits to your repository. It does not—can not—change any old ones: the hash ID of a commit is a cryptographic checksum of its contents, so any attempt to change anything results in a new, different hash ID. Instead, it adds any new commits that have appeared, then changes your origin/*
names to match what's in the other Git.
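A quick way to convince yourself that fetch only adds commits and moves origin/* names is to try it against a pretend server. The setup below is a sketch under assumptions: a bare repository on disk as the server, a second clone named other that pushes E, and a "Demo" identity, none of which come from the question.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
base=$(mktemp -d) && cd "$base"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
for msg in A B C; do git -C seed commit -q --allow-empty -m "$msg"; done
git clone -q --bare seed server.git
git clone -q server.git work
git clone -q server.git other

# Someone else adds E to the server's master.
git -C other commit -q --allow-empty -m "E"
git -C other push -q origin master

cd work
git commit -q --allow-empty -m "D"       # our own new commit
before=$(git rev-parse origin/master)    # our memory of their master: C
mine=$(git rev-parse master)             # our master: D

git fetch -q origin                      # pick up E and update origin/master

after=$(git rev-parse origin/master)     # now E
git log -1 --format=%s origin/master
```

Our own master still names D afterwards; only the remote-tracking name moved.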
(I will note, as an aside here, that git push
can see that their master
points to commit E
, but it doesn't change your origin/master
. The reason is that git push
is sending things, not receiving them. It will update or create remote-tracking branch names for branch-name create-or-updates that you send to them and that they then accept, but it won't update any others. This is a design choice made for a bunch of reasons, including the fact that Git won't let you set a name to point to a commit you don't have. If you don't have their E
yet, you can't set your own origin/master
to point to it.)
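This behavior can be observed directly. In the sketch below (pretend server, invented directory names, "Demo" identity), the server's master has already moved to E when we push, yet only the name we pushed gets a fresh remote-tracking entry.

```shell
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
base=$(mktemp -d) && cd "$base"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/master
for msg in A B C; do git -C seed commit -q --allow-empty -m "$msg"; done
git clone -q --bare seed server.git
git clone -q server.git work
git clone -q server.git other
git -C other commit -q --allow-empty -m "E"
git -C other push -q origin master       # server's master is now E

cd work
git commit -q --allow-empty -m "D"
stale=$(git rev-parse origin/master)     # still C: we have not fetched
git push -q origin master:test           # creates test on the server

git log -1 --format=%s origin/test       # remote-tracking name created by the push
git log -1 --format=%s origin/master     # unchanged by the push
```

Running git fetch afterwards is what brings origin/master up to date.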