It was okay to not understand the basics of git when I was working by myself, but now that I'm working with another person and we each submit pull requests to have them merged by the other, it's starting to be a problem.

The workflow: I write in my "author" branch. When that's ready to be reviewed, I submit a pull request, and my editor merges that into master. When she has comments for me, she submits a pull request of her Editor branch, and I merge them into master.

Today, I'm getting a completely infuriating circular error and I don't understand what I'm being asked to do.

thomas@trigger ‹ author ↑● › : ~/pm/wip
[1] % git push
To https://github.com/mathpunk/punk-mathematics-text.git
 ! [rejected]        Editor -> Editor (non-fast-forward)
 ! [rejected]        author -> author (non-fast-forward)
error: failed to push some refs to 'https://github.com/mathpunk/punk-mathematics-text.git'
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes (e.g. 'git pull') before pushing again.  See the
'Note about fast-forwards' section of 'git push --help' for details.

So, I try git pull and get this:

thomas@trigger ‹ author ↑● › : ~/pm/wip
[130] % git pull
You asked me to pull without telling me which branch you
want to merge with, and 'branch.author.merge' in
your configuration file does not tell me, either. Please
specify which branch you want to use on the command line and
try again (e.g. 'git pull <repository> <refspec>').
See git-pull(1) for details.

If you often merge with the same branch, you may want to
use something like the following in your configuration file:
    [branch "author"]
    remote = <nickname>
    merge = <remote-ref>

    [remote "<nickname>"]
    url = <url>
    fetch = <refspec>

See git-config(1) for details.

Questions related to this problem are frequent on Stack Overflow, but apparently I don't understand the basic concepts of git well enough to follow the answers. Also, maybe our workflow is insane: neither of us is a developer. What can I try next?

tom
  • Are you using Git from the command-line only, or are you using any graphical interface for it as well? Using any of the tools that allow you to visualise the repository may make it easier to see what is going on, to see the commits are on the remote branches that are missing from your local branches, and vice versa. If you are using the command-line only, you can get something similar with `git log --decorate --all --graph`. –  May 07 '14 at 22:15
  • I've just been using the command line, except today when I was banging on the keyboard trying to figure it out and downloaded four GUIs in rapid succession hoping I'd grasp one. The decoration flag sounds intriguing, I'll check it out. Thank you! – tom May 08 '14 at 03:02

1 Answer

This is one of those areas where I think the git documentation is really terrible. Everything points you to git pull but git pull is just a convenience method built on top of several underlying items, and it's the underlying items that are critical to understanding this. Git lets the critical-understanding parts "leak out" while trying to pretend that they are irrelevant.

That out of the way, here are the actual basic elements:

  • When you are sharing stuff with someone, there are multiple independent repositories (repos) involved. At a minimum there is "yours" and "theirs", and very often there is a third (e.g., on github) that you both use to accomplish the sharing. In this case, the model is that this third party—github—handles all the complicated authentication stuff (https, ssh, whatever), so that you don't have to.

  • To obtain stuff from some other repo, you use git fetch.

  • To deliver stuff to some other repo, you use git push.

In other words, the opposite of push is not actually pull, but rather, is fetch.

refspecs

To make these two operations (push and fetch) work, git uses what it calls "refspecs". Remember that when pushing and fetching, there are two repositories involved.

Refspecs usually just look like a single branch name. However, the simplest "real" version of a refspec is actually two branch names separated by a colon:

master:master
Editor:Editor
author:author

The left and right hand sides name the branch in the two repos. For a push, the name on the left is the branch in your repo, and the name on the right is the branch in their repo. For a fetch, the name on the left is the branch in their repo, and the name on the right is the branch in your repo.
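To see the two sides of a refspec in action, here is a minimal sketch that runs entirely on your own disk. It assumes a throwaway bare repository (hub.git) standing in for github, and the branch names for-editor and from-hub are made up purely so that the left and right sides differ visibly.

```shell
#!/bin/sh
# Sketch: the left side of a refspec is the source, the right side the destination.
# hub.git is a local stand-in for the shared repository; "for-editor" and
# "from-hub" are hypothetical branch names used only for illustration.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/hub.git"
git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/author      # start on a branch named author
git remote add origin "$work/hub.git"
git commit -q --allow-empty -m "first draft"
my_tip=$(git rev-parse author)

# push: <my branch>:<their branch>
git push -q origin author:for-editor
their_tip=$(git --git-dir="$work/hub.git" rev-parse refs/heads/for-editor)

# fetch: <their branch>:<my branch>
git fetch -q origin for-editor:from-hub
my_copy=$(git rev-parse refs/heads/from-hub)

echo "my author:        $my_tip"
echo "their for-editor: $their_tip"
echo "my from-hub:      $my_copy"
```

All three lines print the same commit id: the one commit travelled out under one name and came back under another.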

Here's where the model gets a bit strange again, and non-symmetric. Git believes (as much as it can be said to believe anything) that when you fetch, you and they may both have been doing work; but when you push, only you should have been doing any work. (There are good reasons for this, which I won't go into as this is already very long. :-) )

To make this all function, fetch provides for branch renaming. Instead of fetching "their" branches (master, author, etc) directly to "your" branches—this would make it terribly difficult to access the work you did since the last fetch—you fetch "their" stuff to what git calls a "remote branch".


fetch and "remote branches"

A "remote branch" is in fact a local thing, despite the name "remote branch". So is a "remote", for that matter. A "remote" is a name, like origin or github, that you configure locally. Associated with this "remote" is a URL, such as https://github.com/mathpunk/punk-mathematics-text.git. There is also a fetch line. Don't worry for now about the mechanics of the fetch line (once it's created it normally Just Works); just know that this is how git knows what "remote branch name" to use when fetching.

You do have to worry, to some extent, about the actual name of the remote. The usual default name is origin but you choose the name when you do a git remote add command. The remote's name becomes part of the "remote branch" name. Specifically, the remote name is prefixed in front of the branch name.

Thus, assuming you will git fetch origin to bring stuff over from github, the "remote branch names" will be origin/master, origin/Editor, and origin/author.

If you git fetch github to bring stuff over from github, the "remote branch names" will instead be github/master, github/Editor, and github/author.

In all cases, you just name the remote, and fetch brings over all branches, but renames them. By leaving out the refspec, you use the default one from the fetch line.
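If you want to look at what a remote actually records, the following sketch adds one to an empty scratch repository and prints its two settings. It assumes the remote is named origin and reuses the URL from the question; nothing is contacted over the network, since git remote add only writes to the local configuration.

```shell
#!/bin/sh
# Sketch: what "git remote add" stores. Purely local; no network access.
set -eu
work=$(mktemp -d)
git init -q "$work/demo"
cd "$work/demo"

# "origin" is just the conventional name; any name would do.
git remote add origin https://github.com/mathpunk/punk-mathematics-text.git

url=$(git config --get remote.origin.url)
fetch_line=$(git config --get remote.origin.fetch)
echo "url:   $url"
echo "fetch: $fetch_line"   # prints: fetch: +refs/heads/*:refs/remotes/origin/*
```

The fetch line is itself a refspec with wildcards: every branch of theirs (left) is stored under your origin/ names (right).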

If you add a branch name (git fetch origin author, for instance), git turns this into a "real" refspec by using that same fetch line to rename the incoming branch. In effect, git fetch origin author turns into git fetch origin author:origin/author. Their branch name on the left, author, turns into your "remote branch name", origin/author, on the right. (Internally, the full spelling of that right-hand name is refs/remotes/origin/author.)

(The idea here is that you can add multiple different remotes. If you, your editor, and your publisher all want to share directly with each other, rather than with a third party like github, you could have two remotes, named editor and publisher for instance, and you would get "remote branch names" like editor/Editor for one remote, and publisher/Editor for another. If you use a single sharing site like github, all of this is pointless complication, though.)


OK, back to fetch and push. When you git fetch origin, you use your remote origin name to bring over "their" branches but put them under your origin/* "remote" branches. That keeps their work separate from your work. (Of course at some point you need to combine these; we'll get to that in a moment.)

When you push, though, "push" does not use the concept of a "remote branch". You simply push directly to their branch. So if you have some changes in your repo, in your branch author, and you want to push those, you just git push origin author:author. The origin part here is the remote name again, and the last part is a refspec as usual, naming your branch (author) and then their branch (also just author).

If you include a branch name on your push command, the lack of branch renaming here shows through: git push origin author "means" git push origin author:author. Your branch name, author, on the left, is simply copied over to be used as their branch name, author, on the right.
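The asymmetry between fetch and push can be watched in a small local experiment. This is only a sketch: hub.git is a bare repository standing in for github, "mine" and "theirs" are two scratch clones, and the commit messages are invented.

```shell
#!/bin/sh
# Sketch: push writes straight to their branch; fetch only updates origin/*.
# hub.git, mine and theirs are throwaway stand-ins, not the real repositories.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/hub.git"
git --git-dir="$work/hub.git" symbolic-ref HEAD refs/heads/author

git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/author
git remote add origin "$work/hub.git"
git commit -q --allow-empty -m "first draft"
git push -q origin author            # same as: git push origin author:author

# The other person clones, commits, and pushes directly to the hub's author branch.
git clone -q "$work/hub.git" "$work/theirs"
( cd "$work/theirs"
  git commit -q --allow-empty -m "editor comments"
  git push -q origin author )

# Back in my repository: fetch moves origin/author but leaves my author alone.
git fetch -q origin
mine_tip=$(git log -1 --format=%s author)
tracking_tip=$(git log -1 --format=%s origin/author)
echo "author:        $mine_tip"       # prints: author:        first draft
echo "origin/author: $tracking_tip"   # prints: origin/author: editor comments
```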


Review

Time for a quick review:

  1. You set up a "remote"
  2. which you use to fetch into your "remote branches", and
  3. which you use to push your local branches to their local branches.

Think about that for a moment. Notice that there's a step missing.

How do you get their work, which (after step 2) is now listed in your remote-branches, into your own local branches?

This is where git merge, and hence also git pull, come in.

This is also where the item in your question title comes in. Fast-forward, or non-fast-forward, is a property of a "label move", in git.

To really understand this, we have to take another little side trip, discussing git's model for commits and branches.


Commit graphs

Every commit has a guaranteed-unique identifier (the SHA-1, the 9afc317... number). Neither you nor anyone else will ever create any different commit that has that number, but if you or anyone else can manage to recreate that commit exactly, you will get the same number. (This is important for fetching.)
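The "recreate that commit exactly" claim is easy to check for yourself. The sketch below makes the same empty commit in two unrelated scratch repositories, pinning the author, committer, and timestamps (all placeholder values) so that every input is identical, and then makes a third commit that differs only in its message.

```shell
#!/bin/sh
# Sketch: identical commit contents give identical ids, in any repository.
# The identity and the fixed date are arbitrary placeholder values.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_DATE="2014-05-07 12:00:00 +0000"
export GIT_COMMITTER_DATE="2014-05-07 12:00:00 +0000"

work=$(mktemp -d)
cd "$work"
for repo in one two; do
  git init -q "$repo"
  ( cd "$repo" && git commit -q --allow-empty -m "same content" )
done
git init -q three
( cd three && git commit -q --allow-empty -m "different message" )

id_one=$(git --git-dir=one/.git rev-parse HEAD)
id_two=$(git --git-dir=two/.git rev-parse HEAD)
id_three=$(git --git-dir=three/.git rev-parse HEAD)
echo "one:   $id_one"
echo "two:   $id_two"     # same id as "one"
echo "three: $id_three"   # a different id
```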

Each commit also contains—indirectly, by reference—a complete stand-alone entity, the "tree". The tree is the set of all files in that commit. The commit, however, is not quite stand-alone: it has one or more "parent" commits. These determine the commit history, and thus "build up" the actual branching structure.

(In many—maybe even most—other version control systems, the tree is not stand-alone: the VCS has to grind through parent and child commits to extract the tree, and/or to make new commits. But in git, each tree is independent; it only has to go through parent/child sequencing to compare two trees, or to determine commit history.)

Given a commit, git finds its parent commit(s), and their parent(s), and so on, and builds up a "commit graph":

        C - F
      /       \
A - B           G   <-- master
      \       /
        D - E

This is a graph of a repository containing 7 commits in all, all on one branch named master. A is the initial commit (A here stands for some big ugly unique SHA-1 number), B has some change since A (comparing the two trees for A and B will show the change), and then someone—or maybe two "someones"—did what would be called "branching": created commit C based on commit B, and created commit D also based on commit B.

After that, someone created commit E based on D, and F based on C.

Finally, someone combined the two branches to make a merge commit, commit G. Commit G has as its (two) parents, both F and E. The fact that it has two parents is, in fact, what makes it a "merge commit".

When all this happens in a single repository, it's straightforward enough. The one "someone", the person using the repository, made commits A, B, and C on branch master, then perhaps created a named branch starting from commit B:

git checkout -b sidebranch master~1

and made commits D and E. Then they went back to master:

git checkout master

and made commit F, and then ran:

git merge sidebranch

to create commit G. After this they could delete branch sidebranch, as commit G (now the tip commit on master) points back to commit E as well as to commit F.
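Put together, the single-repository story can be replayed as a script. This is a sketch in a scratch directory: the commits are empty (made with --allow-empty) and their messages are just the letters from the graph, so only the shape of the history matters.

```shell
#!/bin/sh
# Sketch: rebuild the seven-commit graph A..G in a throwaway repository.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/graph"
cd "$work/graph"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master

git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git commit -q --allow-empty -m C

git checkout -q -b sidebranch master~1    # branch off at B
git commit -q --allow-empty -m D
git commit -q --allow-empty -m E

git checkout -q master
git commit -q --allow-empty -m F
git merge -q -m G sidebranch              # G gets two parents: F and E

git log --graph --oneline --decorate
first_parent=$(git log -1 --format=%s 'master^1')
second_parent=$(git log -1 --format=%s 'master^2')
echo "parents of G: $first_parent and $second_parent"   # prints: parents of G: F and E
```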

This same pattern, however, occurs when both you, working in your own repo, and "they", working in theirs, make commits. Let's say you're working on master and you've made commits A and B:

A - B   <-- master

At this point you push your work to the sharing point (github), so that it has A and B. They clone the github repository, which gives them a third repo that also holds the two commits A and B and uses github as its sharing point.

Now you work within your repo and create commit C. They work within theirs and create D and E, and before you push C to github, they push their D and E to github:

[you:]

        C   <-- master
      /
A - B

[them, and github:]

A - B
      \
        D - E   <-- master

At this point, let's say you use git fetch github. Remember, fetch renames "their" branch, so the result is this:

        C   <-- master
      /
A - B
      \
        D - E   <-- github/master

Git can do this because each commit has a unique SHA-1, so it knows that your A and B and their A and B are the same, but your C is different from their D and E.

At this point, you can create commit F, which makes your master point to your newest commit:

        C - F   <-- master
      /
A - B
      \
        D - E   <-- github/master

Now if you want to share your work, this is when you'd git push github ... but the problem is, your master has commits A - B - C - F, while commits D and E are, from your point of view, only on github/master.

If you push your master to github and make github's master point to commit F, commits D and E will be lost. ("They", whoever they are, will still have them, and you will still have them but named github/master, so it's possible to fix this, but it's a pain.)

The solution is for you to patch this up so that "their" commits, D and E, are also on your master. One easy way to do that is for you to merge your work and theirs, giving:

        C - F
      /       \
A - B           G   <-- master
      \       /
        D - E   <-- github/master
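The whole two-person sequence can be rehearsed locally before trying it on real work. The sketch below assumes a bare repository (hub.git) playing the part of github and two scratch repositories named you and them; the remote is named github to match the pictures above, and the commits are empty ones labelled with the letters used here.

```shell
#!/bin/sh
# Sketch: you and they diverge, you fetch, then merge their work into yours.
# hub.git, you and them are throwaway stand-ins for the real repositories.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/hub.git"
git --git-dir="$work/hub.git" symbolic-ref HEAD refs/heads/master

git init -q "$work/you"
cd "$work/you"
git symbolic-ref HEAD refs/heads/master
git remote add github "$work/hub.git"
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git push -q github master

git clone -q -o github "$work/hub.git" "$work/them"

git commit -q --allow-empty -m C          # your work, not pushed yet
( cd "$work/them"
  git commit -q --allow-empty -m D
  git commit -q --allow-empty -m E
  git push -q github master )             # they get there first

git fetch -q github                        # D and E arrive as github/master
git commit -q --allow-empty -m F
git merge -q -m G github/master            # G ties the two lines together

git log --graph --oneline --decorate --all
tip=$(git log -1 --format=%s master)
theirs=$(git log -1 --format=%s github/master)
echo "master is at $tip, github/master is at $theirs"   # prints: master is at G, github/master is at E
```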

Fast-forward

Notice how your branch label, master, has "moved forward" every time you made a new commit?

You made commit F and master, which used to point to commit C, moved forward to point to the new commit, F.

Then, you made merge-commit G and master, which used to point to F, moved forward to point to the new commit, G.

The label "moves forward" along the branch as you build it.

Suppose we have another label—another branch name—pointing to (say) commit B, all along, that we have not yet moved:

      ..............<-- br
     .
    .   C - F
    v /       \
A - B           G   <-- master
      \       /
        D - E

We can now ask git to "slide label br forward", and to do it "fast"—all at once, all the way to commit G:

git checkout br
git merge --ff-only master

When we ask git to do a merge, if we tell it --ff-only (fast forward only), it will see if there's a way to slide the label forward from whichever commit it points to now, to the target commit, in this case G. (The name master points to commit G so merge picks commit G as the fast-forward target.) In this particular case there are actually two ways to do it, B-C-F-G or B-D-E-G; either one suffices to allow this "fast forward".

(With --ff-only, if the branch label can't be fast-forwarded, the merge request is simply rejected. Without --ff-only, git will attempt to create a new, actual merge commit, so that the label can be moved forward. And with --no-ff, git merge will create a merge even if it's already possible to do a fast-forward. The default, with no options at all, is to fast-forward-if-possible, else make new merge commit.)
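Here is a small sketch of both outcomes of --ff-only, using a scratch repository with empty commits. The branch name br comes from the picture above; the commit labels X and Y are invented for the second half, where the two branches have diverged.

```shell
#!/bin/sh
# Sketch: --ff-only slides a label forward when it can, and refuses when it cannot.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q "$work/ff"
cd "$work/ff"
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m A
git commit -q --allow-empty -m B
git branch br                          # br stays behind at B
git commit -q --allow-empty -m C       # master moves on to C

# Case 1: br (at B) is an ancestor of master (at C), so the label can slide.
git checkout -q br
git merge -q --ff-only master
br_tip=$(git log -1 --format=%s br)
echo "br is now at $br_tip"            # prints: br is now at C

# Case 2: make the branches diverge, then try again.
git commit -q --allow-empty -m X       # on br
git checkout -q master
git commit -q --allow-empty -m Y       # on master
if git merge -q --ff-only br 2>/dev/null; then
  ff_result=accepted
else
  ff_result=rejected
fi
echo "fast-forward of master to br: $ff_result"   # prints: fast-forward of master to br: rejected
```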

push requires the fast-forward property

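If we ask git to push our new master, it's allowed, because this meets the "fast-forward" test. When we do our push, we'll tell github: "Please take commits C, F, and G, and then move your label master (which we're calling github/master) from commit E to commit G". Is there a path from E to G? There is, so it's allowed. Had we pushed before merging, we would have been asking github to move master from E to F. There is no path from E to F, so that push would be rejected as a non-fast-forward, which is the kind of rejection shown in the question.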
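The rule can be tested without touching the real repository. This sketch, again using a local bare repository as a stand-in for github and two scratch repositories, tries the push once before merging and once after; the labels B, E, F, and G are chosen to echo the pictures, not taken from any real history.

```shell
#!/bin/sh
# Sketch: a push is refused until their commits are part of your branch.
# hub.git, you and them are throwaway stand-ins for the real repositories.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/hub.git"
git --git-dir="$work/hub.git" symbolic-ref HEAD refs/heads/master

git init -q "$work/you"
cd "$work/you"
git symbolic-ref HEAD refs/heads/master
git remote add github "$work/hub.git"
git commit -q --allow-empty -m B
git push -q github master

git clone -q -o github "$work/hub.git" "$work/them"
( cd "$work/them"
  git commit -q --allow-empty -m E
  git push -q github master )

git commit -q --allow-empty -m F

# First attempt: the hub's master is at E, and E is not an ancestor of F.
if git push -q github master 2>/dev/null; then first_push=accepted; else first_push=rejected; fi

# Bring their work in, merge it, and try again.
git fetch -q github
git merge -q -m G github/master
if git push -q github master 2>/dev/null; then second_push=accepted; else second_push=rejected; fi

echo "before merging: $first_push"    # prints: before merging: rejected
echo "after merging:  $second_push"   # prints: after merging:  accepted
```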


pull

All git pull does, really, is run git fetch, and then run git merge.

Unfortunately, that means you do need to understand all of the above to really understand git pull.

There are several large wrinkles here though. First, I've been using git fetch origin and git fetch github above. In other words, I keep naming a remote. Where does the remote come from, when you do git pull?

The answer is that it comes from your configuration. Each branch in your repository can name a remote:

$ git config branch.author.remote github

Now the "remote" for branch author is github.

Second, if you run git merge, you have to tell it what to merge. Where does the merge name come from, when you do git pull?

The answer is, again, that it comes from the configuration. Each branch can name an upstream merge branch:

$ git config branch.author.merge author

Git combines the merge with the remote, so that after these two git config commands, git pull essentially does git merge github/author.
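In git 1.8.0 and later, both settings can be written in one step with git branch --set-upstream-to. The sketch below tries that in a scratch setup: hub.git is a local stand-in for github, the remote is named github as in the commands above, and the branch is author. Note that git records the merge setting as the full name refs/heads/author.

```shell
#!/bin/sh
# Sketch: give branch "author" an upstream so that a bare "git pull" knows what to do.
# hub.git and mine are throwaway stand-ins; the remote name "github" is an assumption.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
git init -q --bare "$work/hub.git"
git init -q "$work/mine"
cd "$work/mine"
git symbolic-ref HEAD refs/heads/author
git remote add github "$work/hub.git"
git commit -q --allow-empty -m "first draft"
git push -q github author
git fetch -q github                    # make sure github/author exists locally

# Writes branch.author.remote and branch.author.merge in one go.
git branch --set-upstream-to=github/author author >/dev/null

remote_setting=$(git config --get branch.author.remote)
merge_setting=$(git config --get branch.author.merge)
echo "branch.author.remote = $remote_setting"   # prints: branch.author.remote = github
echo "branch.author.merge  = $merge_setting"    # prints: branch.author.merge  = refs/heads/author

git pull -q                            # no longer asks which branch to merge with
```

An equivalent shortcut when pushing a branch for the first time is git push -u github author, which sets the same two values.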

I say "essentially" because there's yet another wrinkle: in older versions of git, pull runs fetch in such a way that it doesn't update the remote-branch names. Instead, it uses a special FETCH_HEAD file. (In newer versions of git, it still uses FETCH_HEAD but it does update the remote-branch names too.)

Last, there's a very big wrinkle: you can configure git pull to use git rebase instead of git merge. But this answer is now complete enough; I'm not going to get into those details.

torek
  • +1. Note: on the complexities around `git pull`, that is why there is a `git update` proposition floating around these days...: http://www.spinics.net/lists/git/msg230598.html – VonC May 07 '14 at 23:46
  • Wow. Can I submit a pull request to get this answer into git's documentation? I'ma study it in detail. Thanks so much. – tom May 08 '14 at 03:03