First, let me suggest that you read the Pro Git book, because most of the documentation that comes with git is pretty bad, especially for newbies. (It's improved from where it was 5+ years ago, but it still leaves much to be desired.)
That said, let me try a quick introduction to "what you need to know" here.
Introductory stuff
- When working with other people (including github which counts as "other people" here), you're having your git call up their git over the internet-phone and exchange work. These other people are peers, i.e., they have the exact same ability to do new work that you do, at least in the general case (github won't itself do new work all on its own). What this means is that during an exchange, you might have things "they" (whoever they are) don't, but they might also have things you don't.
- Git exchanges work with individual commits.[1] Exactly which commits, and how, gets a bit more complicated, so let's hold off on that for a moment.
- The opposite of git push is not git pull, it's actually git fetch. I think the error message you get should not refer directly to git pull; we'll see more on this in a bit as well.
- Git keeps track of commits internally by their "true name" SHA-1 IDs, which are the long 40-character-hex things you see in git log output, for instance. Every commit has some set of "parent" commits, identified by SHA-1 IDs. You could use SHA-1 IDs yourself, but they're extremely human-unfriendly: you'd never remember whether you should start with f13c9ab or 0e1d227 or whatever (and these are abbreviated IDs!).
- So git provides you, the user, with branch names like master. A branch name is simply a name that you tell git to use to keep track of the newest commit you've made on that branch.
- The newest commit has, as its parent, the commit that "used to be newest", which has an even older "newest" as its parent, and so on. (The way git achieves this is remarkably simple: when you make a new commit, its parent is "the current commit", and once the commit is safely stored in the repository, git replaces the ID stored in the branch with the ID of the new commit just made.)
Given the last three items, let's draw a commit graph fragment:
... <- 1f3a33c <- 98ab102 <-- branch
Here, the tip of the branch is commit 98ab102 and its parent is 1f3a33c. We can say that the name branch "points to" 98ab102, which points to its parent, and so on.
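To make the "branch name points to the newest commit" idea concrete, here is a minimal toy model in Python. It is only a sketch under loud assumptions: the Repo class, its method names, and the ID scheme (hashing just the parents and the message) are all made up for illustration, and real git stores and hashes a good deal more.

```python
import hashlib


class Repo:
    """Toy model: commits are dict entries, branches are names mapped to commit IDs."""

    def __init__(self):
        self.commits = {}   # commit ID -> {"parents": [...], "message": str}
        self.branches = {}  # branch name -> ID of the tip commit

    def commit(self, branch, message):
        # The parent of the new commit is whatever the branch points to right now.
        tip = self.branches.get(branch)
        parents = [tip] if tip is not None else []  # very first commit: no parent
        # Hypothetical ID scheme for this sketch only (real git hashes more data).
        raw = (",".join(parents) + "\n" + message).encode()
        commit_id = hashlib.sha1(raw).hexdigest()
        self.commits[commit_id] = {"parents": parents, "message": message}
        # Only once the commit is stored does the branch name move to it.
        self.branches[branch] = commit_id
        return commit_id

    def history(self, branch):
        # Walk backwards from the tip, following first parents.
        out = []
        cur = self.branches.get(branch)
        while cur is not None:
            out.append(cur)
            parents = self.commits[cur]["parents"]
            cur = parents[0] if parents else None
        return out


repo = Repo()
first = repo.commit("branch", "initial")
second = repo.commit("branch", "second")
```

After these two calls, the name branch maps to the second commit, whose only parent is the first commit, and the first commit has no parent at all.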
When you make a new repository (git init, assuming this actually makes one[2]) it's entirely empty of commits, so the first commit has no parent. It's the second and third and so on commits that point back to previous commits.
When you do a "merge", you tell git to take two (or more,[3] but let's just say two, for now) existing commits and make a new commit that has both of those as parents, by combining all the changes from some common point. Merges are not that complicated, but they're something you should generally do after deliberately dividing up work into separate lines of development.
Now, back to fetch, push, and pull
You've created a new repository (git init), put in at least one commit (git commit), and added a "remote" (git remote add origin ...). This last step says "I have a peer git I want to call origin, so remember the URL under that name."
Then, you asked your git to call up "origin" (git push ... origin) and give it your commits as found under your branch-name "master" (git push ... master). Here's where it gets a little bit tricky. When your git talks to his git, what name should his git use? The short answer here is that his git will use the name master too. This is changeable, but changing it is not what you want here.
(You also asked, via the -u flag, to have your git record this exchange in your git configuration. We'll leave that aside for now.)
When your git called up his git, your git said "here, look at these shiny new commits" (and he did), and then your git said: "Why not add these new commits exactly as they are to your repository,[4] and make your master point to the tip-most of those commits?" His answer was: "Well, I could do that, but then I'd lose some commits I have that you don't." This is where you see the rejected message.
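The receiving side's decision can be sketched as a reachability check: moving the branch is safe only if the commit it currently points to can still be reached from the proposed new tip. The following is a toy model, not git's actual implementation; the function names, the dict-of-parents representation, and the commit names are all assumptions made for illustration, and a real server may apply further policies on top of this.

```python
def is_ancestor(commits, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack, seen = [descendant], set()
    while stack:
        cur = stack.pop()
        if cur == ancestor:
            return True
        if cur in seen:
            continue
        seen.add(cur)
        stack.extend(commits[cur])
    return False


def receive_push(remote_commits, remote_branches, new_commits, branch, new_tip, force=False):
    # Look at the offered commits first, then decide whether to move the branch.
    merged = {**remote_commits, **new_commits}
    old_tip = remote_branches.get(branch)
    if old_tip is not None and not force and not is_ancestor(merged, old_tip, new_tip):
        return "rejected"  # moving the name would "lose" commits only they have
    remote_commits.update(new_commits)
    remote_branches[branch] = new_tip
    return "ok"


# Their repository: base <- theirs, with master pointing at "theirs".
remote_commits = {"base": [], "theirs": ["base"]}
remote_branches = {"master": "theirs"}

# Your push: base <- yours. It does not build on "theirs".
diverged = receive_push(
    remote_commits, remote_branches, {"base": [], "yours": ["base"]}, "master", "yours"
)

# A push whose new commit has "theirs" as its parent is a pure addition.
added = receive_push(
    remote_commits, remote_branches, {"yours2": ["theirs"]}, "master", "yours2"
)
```

In this sketch the first push is refused and their master stays where it was, while the second one is accepted because nothing they have becomes unreachable.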
Maybe you want them to forget their stuff-so-far. If so, you can ask for a "force push", where you have your git tell them to set their master even if that "loses" some commits. It's still up to them whether to do this, but often it's not what you want.
Maybe you want to pick up what they have and add it to your collection, or pick up what they have and toss out your own work-so-far. This is where you want git fetch.
What git fetch does, de-complicated as much as possible,[5] is call up the peer over the Internet-phone and find out what they have that you don't. It brings over all these commits (remember, the exchanges go by commits) and adds them, Borg-like, to your repository. Then (this is the crucial part) it changes their branch-names so that they won't interfere with your branch-names.
The names that git fetch uses to synchronize with your peers are called "remote-tracking branches" or sometimes "remote branches". One odd thing about "remote branches" is that they're not actually on the remote! They're kept in your own repository. The reason is simple: they're a snapshot of what was on the remote, the last time your git talked to that remote. After the two gits hang up the Internet-phone, the remote could get changed. (How fast a remote actually changes depends, of course, on how busy that remote git is.)
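As a rough illustration of the fetch step described above, here is a toy Python model. It assumes a made-up representation (a dict mapping commit names to parent lists, and a dict mapping names to tips), and the fetch function here is hypothetical; it only mimics the two effects discussed: new commits are added, and their branch tips are recorded under renamed, locally stored names.

```python
def fetch(local_commits, local_refs, remote_name, remote_commits, remote_branches):
    # Copy over every commit we lack; nothing we already have is altered.
    for commit_id, parents in remote_commits.items():
        local_commits.setdefault(commit_id, parents)
    # Record their branch tips under renamed names, leaving our own branches alone.
    for branch, tip in remote_branches.items():
        local_refs[remote_name + "/" + branch] = tip


# Your repository before the fetch: base <- yours, with master at "yours".
local_commits = {"base": [], "yours": ["base"]}
local_refs = {"master": "yours"}

# Their repository: base <- theirs, with master at "theirs".
fetch(
    local_commits,
    local_refs,
    "origin",
    {"base": [], "theirs": ["base"]},
    {"master": "theirs"},
)
```

Afterwards your own master is untouched, and the snapshot of their master lives in your repository under a separate name.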
The renaming-pattern here is simple: take the remote's branch name (like master) and add the name of the remote (origin) in front: your origin/master "tracks" origin's master. (There's a full-name form that you can use if you accidentally name one of your own purely-local, not-remote-tracking, branches origin/oops for instance, but your best bet is not to do that in the first place.)
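The renaming and the "full-name form" can be sketched in a few lines of Python. The two helper functions are hypothetical names for this illustration; the refs/heads/ and refs/remotes/ prefixes are the namespaces git uses for local branches and remote-tracking branches respectively.

```python
def remote_tracking_name(remote, their_ref):
    """Map a branch ref on the remote to the full name your repository keeps for it."""
    prefix = "refs/heads/"
    if not their_ref.startswith(prefix):
        raise ValueError("not a branch ref: " + their_ref)
    return "refs/remotes/" + remote + "/" + their_ref[len(prefix):]


def short_name(full_ref):
    # Strip the namespace to get the short form you would normally type.
    for prefix in ("refs/heads/", "refs/remotes/"):
        if full_ref.startswith(prefix):
            return full_ref[len(prefix):]
    return full_ref


tracking = remote_tracking_name("origin", "refs/heads/master")

# A purely local branch unwisely named origin/oops has this full name:
local_oops = "refs/heads/origin/oops"
```

Here tracking is refs/remotes/origin/master, whose short form is origin/master. The local branch's short form, origin/oops, looks just like a remote-tracking name, which is why only the full names tell the two kinds apart.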
Wait, where does git pull come in?
Note that so far, it's been all git push and git fetch. But suppose you've done a git fetch and picked up their work? Now you have a bit of a problem: you have your work, and you have their work, and these two have diverged.
If you both started from a common base, we can draw your commit graph like this:
... base <- yours    <-- HEAD -> master
        \
         theirs      <-- origin/master
I put in the HEAD -> this time to show which branch you're on (run git status and you should see "On branch master" in its output).
One easy way to tie these two divergent bits of work together is to use git merge. If you want to do that, you simply run git merge origin/master (note the slash: you want your git to find your origin/master, which it picked up during git fetch, and merge that into your master, making a new commit on your master). The result looks like this:
... base <- yours <- merge   <-- HEAD -> master
        \         /
         theirs              <-- origin/master
Another way to handle this (usually the better way, in fact) is to use git rebase, which I won't go into in detail here; there are plenty of other StackOverflow answers about using rebase. If you do, though, you wind up with:
... base <- theirs <- yours   <-- HEAD -> master
                  \
                   .........  <-- origin/master
(Note that in all cases, origin/master still points to the tip-most "theirs" commit.)
The main point to keep in mind right now is that whatever you do, if you want to get their git to accept your commits without having to force-push, you need to make your work be an add-on to theirs, so that your new commits "point back to" theirs. It doesn't matter to the push process whether your work has theirs as a "second parent" of a merge commit, or simply builds directly on their commit(s); it only needs to point to their commits, by their commit-IDs. (But again, rebase is usually better than merge!)
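The point that both routes make your work "point back to" theirs can be checked on a toy graph. This Python sketch assumes a made-up representation (commit names mapped to parent lists, with placeholder names instead of real IDs). It models a merge as one new commit with two parents, and a rebase as a copy of your commit whose parent is theirs.

```python
def ancestors(commits, tip):
    """Return the set of all commits reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(commits[cur])
    return seen


# commit name -> list of parents; the names are placeholders, not real IDs
commits = {"base": [], "yours": ["base"], "theirs": ["base"]}

# Before doing anything: master is at "yours", origin/master is at "theirs".
before = "theirs" in ancestors(commits, "yours")

# Option 1, merge: a new commit that has both tips as parents.
commits["merge"] = ["yours", "theirs"]
after_merge = "theirs" in ancestors(commits, "merge")

# Option 2, rebase: a copy of "yours" that builds directly on "theirs".
commits["yours-copy"] = ["theirs"]
after_rebase = "theirs" in ancestors(commits, "yours-copy")
```

Before either step, their commit is not reachable from your tip; after either one it is, which is the property the push needs.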
So, git pull (finally!)...
What git pull does, simplified, is just run git fetch and then git merge. In short, it's meant as a convenience script: you're always fetching and then merging, so we'll give you a script that fetches, then merges.
Of course, git merge is usually wrong. git pull really should use git rebase by default.
You can make git pull use git rebase,[6] but I think it's wiser, at least (or especially?) for newbies, to use the two separate steps, in part because if something goes wrong, the way you recover from this is different for the two different actions. To get out of a failing merge, you use git merge --abort. To get out of a failing rebase, you use git rebase --abort. (These used to be even more different; now it's just "abort whatever's failing", which is a big improvement. But you need to know which to do, and that's an awful lot clearer if you started with git merge or git rebase.)
The bottom line
In the end, the action you need to take here depends on what you want to have happen, and what the remote will allow you to do. If you want to drop their stuff, use git push -f (force), being aware that you're causing them (whoever they are) pain this way, and that they might forbid it entirely. If you want to keep their stuff, use git fetch first, then keep their stuff however you prefer (merge, rebase, rework, whatever).
[1] Or with patches, which you can exchange with your peers by email. It's also possible to "bundle" commits and transfer them by other methods. But in this case we're doing commit-based exchanges over the Internet-phone.
[2] You can safely run git init in an existing repository. It has a minor effect but for our purposes here it basically does nothing, hence the need to say "assuming it actually makes one".
[3] Git calls this an "octopus" merge, even if there are just 3 or 4 parents instead of 8.
[4] Git mostly only ever adds "more stuff". I like to refer to this as the "Git Borg", where git adds your technological distinctiveness to its existing collection.
[5] A common theme in git is "That wasn't quite what you wanted? OK, we'll keep the existing stuff but make it more complicated so you can do what you want too!"
[6] See footnote 5.