"Best practice" is a matter of opinion and therefore off-topic on StackOverflow. More important here is that you understand what these various options do, so that you can pick whichever one suits you the most.
I have only 1 commit that I would like to merge neatly into master (using fast-forward merging).
In this case (and given your setup steps), git rebase followed by git merge will work more often than just git merge. That's because sometimes (perhaps most times, depending on who else you might be working with here) the git rebase will do nothing. In the cases where git rebase does something, the thing it does will be necessary.
But your example command sequence is a bit different:
git checkout master
git pull
git checkout feature
git rebase master
What's that git pull doing in there? Well, we'll get there, because git pull means run git fetch, then run a second Git command, and that second command itself is either git merge or git rebase. This means you need to understand at least one of git rebase and git merge, depending on which second command you pick. You also need to understand git fetch.
TL;DR
The following is unavoidably long, because Git is a bit complicated, but there is a short version:
If you like, you can use git merge --ff-only and just let it fail in cases where it can't be used. That tells you: stop, step back, take a close look at what you have now, and decide if you want to rebase, or merge, or do some longer sequence of operations. See bk2204's accepted answer that went in while I was writing all this...
(I also suggest that Git newbies avoid git pull, so that they know exactly which commands they're running. Once you're familiar with the pieces, then you can use git pull as a convenience, if you find it convenient. When I was new to Git, back in 2005 or 2006 or so, I found that git pull made it unclear what's really happening. Another reason to avoid git pull is that what its second command does depends on what the fetch brought in. You have to know in advance what git fetch is going to fetch! Well, that, or not really care, and not-really-caring is actually kind of common.)
A fast recap of things you may already know, but with pictures
Each commit, in Git, has a unique hash ID. That hash ID is in a sense the "true name" of the commit. No other commit, not even one in some other Git repository, can ever use this commit's hash ID for a different commit. No past or future commit can re-use this ID: it's for this commit, whichever commit "this commit" is.
Every commit stores two groups of things: the data part, which holds a snapshot of all of your files (not changes since the previous commit, just a snapshot), and the metadata part, which holds things like your name and email address, date-and-time stamps, log messages, and so on. One of the metadata items of every commit is a list of hash IDs of previous or parent commits. Most commits have exactly one previous / parent commit.
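To see why a hash ID can act as a commit's "true name", here is a small sketch, assuming a classic SHA-1 repository (newer repositories can use SHA-256 instead). The function name git_object_id, the author line, and the two parent IDs are made up for illustration; the header format and the empty-tree ID are the standard ones.

```python
import hashlib

def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 over a '<kind> <size>\\0' header followed by the object's bytes."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()

# The empty blob gets the same ID in every SHA-1 Git repository:
# e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
empty_blob = git_object_id("blob", b"")

# ID of the empty tree, used here only so the commit body looks realistic.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"

def commit_body(parent: str) -> bytes:
    """Build a hypothetical commit object body with the given parent ID."""
    return (
        f"tree {EMPTY_TREE}\n"
        f"parent {parent}\n"
        "author A U Thor <author@example.com> 0 +0000\n"
        "committer A U Thor <author@example.com> 0 +0000\n"
        "\n"
        "example\n"
    ).encode()

# Same snapshot, same message, different (made-up) parent: different ID.
id_one = git_object_id("commit", commit_body("1" * 40))
id_two = git_object_id("commit", commit_body("2" * 40))
print(empty_blob)
print(id_one != id_two)
```

Because the parent hash IDs are part of what gets hashed, changing a commit's parent necessarily produces a different commit with a different ID.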
A branch name like master holds the hash ID of one commit. That commit is to be considered the last commit in the branch. That's all it has in it, just the one hash ID. It's the commits themselves that remember who comes "before" them.
Whenever something holds a commit hash ID, we say that this thing points to that commit. So we can draw all of this like so:
... <-F <-G <-H <-- master
               \
                I <-J <-- feature
The name master here points to commit H, so that H is the last commit in that branch. The name feature points to commit J, so that J is the last commit in that branch. Commit I is in feature, because J points back to I. Commit H is in the feature branch too, though, because I points back to H.
In other words, commits can be on more than one branch at a time. If we create a second branch from master and add a few commits there, we get:
             I--J <-- feature
            /
...--F--G--H <-- master
            \
             K <-- feature2
(I always get lazy and stop drawing in the backwards-pointing arrows between commits, because it's too annoying. Just remember that the "lines" connecting commits are really backwards-pointing arrows from the later commits to the earlier ones.)
In this case, commits up through H are on all three branches, while the I-J set are only on feature and K is only on feature2.
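The "which branches is this commit on" idea can be sketched as a toy model. This is not how Git stores anything; the Commit class and both helper functions are hypothetical names made up for the illustration, and the letters stand in for real hash IDs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Commit:
    name: str            # stands in for the hash ID
    parents: tuple = ()  # backwards-pointing links to earlier commits

def history(tip):
    """Names of every commit reachable from tip, including tip itself."""
    seen, todo = set(), [tip]
    while todo:
        commit = todo.pop()
        if commit.name not in seen:
            seen.add(commit.name)
            todo.extend(commit.parents)
    return seen

# The three-branch graph drawn above.
F = Commit("F")
G = Commit("G", (F,))
H = Commit("H", (G,))
I = Commit("I", (H,))
J = Commit("J", (I,))
K = Commit("K", (H,))

# A branch name holds exactly one commit: the last one in that branch.
branches = {"master": H, "feature": J, "feature2": K}

def branches_containing(name):
    """A commit is 'on' every branch whose tip can reach it."""
    return sorted(b for b, tip in branches.items() if name in history(tip))

print(branches_containing("H"))
print(branches_containing("I"))
print(branches_containing("K"))
```

In a real repository, git branch --contains with a commit hash answers the same question.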
To make a new commit, we git checkout whichever branch name we want the new commit to be "on". That selects the commit and attaches the name HEAD to the branch name. So, if we run git checkout master, we get:
             I--J <-- feature
            /
...--F--G--H <-- master (HEAD)
            \
             K <-- feature2
We now have commit H out. If we run git checkout feature, we get:
             I--J <-- feature (HEAD)
            /
...--F--G--H <-- master
            \
             K <-- feature2
and we now have commit J out. Let's git checkout feature2 to select commit K:
             I--J <-- feature
            /
...--F--G--H <-- master
            \
             K <-- feature2 (HEAD)
Now let's make a new commit L, in the usual way: edit some files, git add, and run git commit. Git will make a new commit from whatever is in the index or staging area (two different terms for the same thing). The snapshot will freeze all these files forever (or for as long as commit L continues to exist). Git will collect our log message, add our name as author and committer, set the parent of new commit L to existing commit K, and last, write L's new hash ID (whatever it is) into the name feature2 so that we have:
             I--J <-- feature
            /
...--F--G--H <-- master
            \
             K--L <-- feature2 (HEAD)
The current commit is now our new commit L, and the branch names point to the commits as we've drawn them. (Of course their real hash IDs are big ugly strings that look random, that we'll never remember, and that we would have to cut and paste to get right. Hence the use of simple letters here.)
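The bookkeeping part of making a commit can be sketched in a few lines. This is a toy model under obvious assumptions: letters instead of hash IDs, no snapshots or metadata, and a hypothetical make_commit function that is not anything Git exposes.

```python
# Each commit maps to its list of parents; each branch name maps to one commit.
parents = {"G": [], "H": ["G"], "I": ["H"], "J": ["I"], "K": ["H"]}
branches = {"master": "H", "feature": "J", "feature2": "K"}
head = "feature2"  # HEAD is attached to this branch name

def make_commit(new_name):
    """Record a new commit whose parent is the current tip, then move the name."""
    tip = branches[head]
    parents[new_name] = [tip]   # the new commit points back to the old tip
    branches[head] = new_name   # only the current branch name moves
    return new_name

make_commit("L")
print(branches)
print(parents["L"])
```

Only feature2 changes: master and feature still point to H and J, which is why the other rows of the drawing stay exactly as they were.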
Merge is about combining work (except when it's a fast-forward)
Let's draw things this way for a moment:
          I--J <-- branch1 (HEAD)
         /
...--G--H
         \
          K--L <-- branch2
You can see from this diagram that we just have two interesting branch names, and we're on branch1 which is commit J. If we now run git merge branch2, Git will be forced to do a full merge.
The full merge process starts by finding the merge base commit, which in this case is H. It then compares the snapshot in H to the snapshot in J, where we are now, to see what we changed. These are the ours changes; J is the ours commit. Next, Git compares the snapshot in H to the snapshot in L, to see what they changed. These are the theirs changes with L being the theirs commit. The merge process will combine the changes, apply the combined changes to the merge base H, and (if all goes well) make a new merge commit M:
          I--J
         /    \
...--G--H      M <-- branch1 (HEAD)
         \    /
          K--L <-- branch2
New commit M will have, as its snapshot, whatever was in H, modified by adding both our changes (H-vs-J) and their changes (H-vs-L). So we keep our changes and add their changes: that's what merging is about, after all. To remember what got merged, new commit M will have two parents, instead of just one. The first parent will be the commit we had checked out a moment ago, which was commit J. The second parent will be the other commit, the one we merged: commit L.
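The graph side of a full merge (find the merge base, then record a commit with two parents) can be sketched like this. It is a toy model only: it ignores snapshots and conflict handling entirely, and ancestors, merge_bases and merge are hypothetical helper names. In a real repository, git merge-base reports the merge base.

```python
parents = {"G": [], "H": ["G"], "I": ["H"], "J": ["I"], "K": ["H"], "L": ["K"]}
branches = {"branch1": "J", "branch2": "L"}

def ancestors(commit):
    """The commit itself plus everything reachable through parent links."""
    seen, todo = set(), [commit]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(parents[c])
    return seen

def merge_bases(a, b):
    """Common ancestors that are not themselves ancestors of another common one."""
    common = ancestors(a) & ancestors(b)
    return {c for c in common
            if not any(c != d and c in ancestors(d) for d in common)}

def merge(head, other, new_name):
    """Record a merge commit: first parent is ours, second parent is theirs."""
    ours, theirs = branches[head], branches[other]
    base = merge_bases(ours, theirs)
    parents[new_name] = [ours, theirs]
    branches[head] = new_name
    return base

base = merge("branch1", "branch2", "M")
print(base)
print(parents["M"])
```

Both G and H are common to the two branches, but H is the "best" one because G is already behind it.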
Fast-forward merges have no work to combine
But what if the input picture doesn't look like this? Suppose, instead, we have the simpler picture:
...--G--H <-- master (HEAD)
         \
          I--J <-- feature
We are on H now, via the name master to which HEAD is attached. We run git merge feature or git merge hash-of-J. Git finds the best common commit (the best commit that's on both branches), but that's commit H, which is the one we're on! If Git did a full blown merge, it would compare H vs H to see what we changed. That would of course be nothing at all. Then it would compare H vs J to see what "they" (really, we, on feature) changed. Then it would add those changes to H. The result would always exactly match commit J.
If Git did a full merge, we'd get:
...--G--H------K <-- master (HEAD)
         \    /
          I--J <-- feature
where the snapshot in K matches that in J. The difference between K and J is that K is a different commit, with (1) a different hash ID and (2) different parents: both H and J. Well, probably (3) K would have different date-and-time stamps, too.
We can, if we want, make Git do this (with git merge --no-ff), but Git's default is to not do this. Instead of making a new commit K and writing that hash ID into master, Git can just check out existing commit J and put that hash ID into master:
...--G--H
         \
          I--J <-- master (HEAD), feature
You get the same snapshot, but there's no extra commit. The two names, master and feature, now identify the same commit, J, and there's no reason to draw the graph with a kink in it:
...--G--H--I--J <-- master (HEAD), feature
and we can safely delete the name feature if we want.
We can tell Git: do a merge, but only if you can do it as a fast-forward. To do this we use:
git merge --ff-only <branch-or-commit-hash>
This tests whether the merge can be just a fast-forward operation. If so, Git doesn't really merge, and just does the fast-forward. If not, Git won't do a full merge: it reports the failure and leaves everything as it was. (If we left out the --ff-only, Git would try a full merge.)
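The fast-forward test boils down to one question: is the commit we're on an ancestor of the commit we're merging? Here is a toy sketch of that check, with letters for hash IDs and hypothetical helper names (is_ancestor, merge_ff_only). In a real repository, git merge-base --is-ancestor asks the same question.

```python
parents = {"G": [], "H": ["G"], "I": ["H"], "J": ["I"], "K": ["H"], "L": ["K"]}

def is_ancestor(older, newer):
    """True if 'older' can be reached by walking back from 'newer'."""
    seen, todo = set(), [newer]
    while todo:
        c = todo.pop()
        if c == older:
            return True
        if c not in seen:
            seen.add(c)
            todo.extend(parents[c])
    return False

def merge_ff_only(branches, head, other):
    """Move the current branch name forward, or refuse and change nothing."""
    if not is_ancestor(branches[head], branches[other]):
        raise RuntimeError(f"cannot fast-forward {head} to {other}")
    branches[head] = branches[other]

# master at H, feature at J: H is behind J, so the name can just slide forward.
simple = {"master": "H", "feature": "J"}
merge_ff_only(simple, "master", "feature")

# branch1 at J, branch2 at L: neither is behind the other, so this is refused.
diverged = {"branch1": "J", "branch2": "L"}
try:
    merge_ff_only(diverged, "branch1", "branch2")
    refused = False
except RuntimeError:
    refused = True

print(simple, refused)
```

Note that no commit is created in either case: the successful path only rewrites which commit the name master holds.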
Rebase is really about copying commits
Suppose we have the following graph:
          I---J <-- feature
         /
...--G--H--K--L <-- master
(We haven't picked a branch to git checkout yet, so there's no (HEAD) notation.) If we want to combine these right now, as is, Git would be forced to make a real merge, no matter whether we git checkout master; git merge feature or git checkout feature; git merge master. Either way, Git needs to find the merge base H, do two diffs, combine changes, and make a merge commit.
If we don't want a merge commit, though, we can rebase commits I and J by copying them to new commits I' and J'. We'll run:
git checkout feature
git rebase master
Git will list out the commit hash IDs to be copied (which are obvious to us: they're I and J) and then will start the rebase by detaching HEAD from feature, so that it points directly to commit L:
          I--J <-- feature
         /
...--G--H--K--L <-- HEAD, master
This is Git's detached HEAD mode, which rebase uses pretty heavily. Now Git must copy commit I to a new commit. It should compare I to its parent H, to see what changed. Then it should apply these changes to commit L, and make a new commit. We could call the commit M, but since it's a copy of I, we'll call it I' instead. The name HEAD will automatically update to point to the new commit:
          I--J <-- feature
         /
...--G--H--K--L <-- master
               \
                I' <-- HEAD
The snapshot in I' is the result of combining H-vs-I with H-vs-L. That is, this operation, which Git calls a cherry-pick, actually uses the same merge process that git merge uses! But the final commit, I', is a regular non-merge commit, with one parent.
In any case, having copied I to I', Git must now copy J to a new commit J', in the same way: Git will compare I vs J to see what "they" (we) changed, and compare I vs I' to see what we changed here, and combine these changes. That has the effect of adding I-vs-J to our copy I' so that we have:
          I--J <-- feature
         /
...--G--H--K--L <-- master
               \
                I'-J' <-- HEAD
Don't worry if this seems complicated. It is complicated! The end result is pretty clear though: we have new commits I' and J' that are "just as good as" the originals, except that they're better because the parent of I' is L. So the new chain of two commits is like the old chain, except that:
- it starts from a different snapshot (L) and therefore ends with a different snapshot (J'), and
- it comes right after L.
Now that we're done copying commits by cherry-picking, rebase does its last step, which is to move the name feature to point to the last copied commit, and re-attach our HEAD:
          I--J [abandoned]
         /
...--G--H--K--L <-- master
               \
                I'-J' <-- feature (HEAD)
The original commits I-J are still in the repository, but we can't find them any more, because we always start by looking at the names (feature or master) and working backwards. (Eventually, after about 30 days in a normal setup, if no one can find I and J, and you haven't deliberately resurrected them to undo your rebase, Git will sweep them away for real, and those snapshots will be gone.)
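The copy-then-move-the-name shape of a rebase can be sketched as a toy model. It handles only a simple linear chain, ignores snapshots and conflicts, and the helper names (ancestors, rebase) are made up for the illustration; real Git does a cherry-pick for each copy.

```python
def ancestors(parents, tip):
    """The tip plus everything reachable through parent links."""
    seen, todo = set(), [tip]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(parents[c])
    return seen

def rebase(parents, branches, branch, upstream):
    """Copy the commits that are on branch but not upstream, then move the name."""
    keep = ancestors(parents, branches[upstream])
    to_copy, c = [], branches[branch]
    while c not in keep:          # walk back until we reach upstream's history
        to_copy.append(c)
        c = parents[c][0]         # toy assumption: a single-parent chain
    to_copy.reverse()             # copy oldest first
    base = branches[upstream]
    for old in to_copy:
        if parents[old] == [base]:
            base = old            # already sits right after base: re-use in place
            continue
        new = old + "'"           # the copy gets a new identity
        parents[new] = [base]
        base = new
    branches[branch] = base       # last step: point the branch name at the result
    return to_copy

parents = {"G": [], "H": ["G"], "K": ["H"], "L": ["K"], "I": ["H"], "J": ["I"]}
branches = {"master": "L", "feature": "J"}
copied = rebase(parents, branches, "feature", "master")
print(copied, branches, parents["I'"], parents["J'"])
```

The originals I and J are still in the parents table afterwards; nothing points to them any more, which is the "abandoned" state in the drawing.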
Doing this kind of rebase makes it possible to fast-forward
What we had before the rebase would have required a real merge. What we have after the rebase is now fast-forward-able. Now that we have:
...--G--H--K--L <-- master
               \
                I'-J' <-- feature (HEAD)
we can use git checkout master followed by git merge --ff-only feature and get:
...--G--H--K--L--I'-J' <-- master (HEAD), feature
just like before.
Sometimes rebase is unneeded
If we start with:
...--G--H <-- master
         \
          I <-- feature (HEAD)
and run git rebase master, Git:
- Lists out the commits that are on feature but not master: I.
- Checks out master as a detached HEAD.
- Copies commit I so that it comes right after H, which is where commit I already is: this doesn't require any copying and Git just says to itself, ah, let's re-use I in place.
- Is finished copying, so moves the name feature to point here, which is where it already points, and re-attaches HEAD.
The result is a blur of motion (listing and detaching and not really doing anything and then reattaching) resulting in no change at all. We're now ready to git merge --ff-only.
What about git fetch?
Your git pull sequence introduced an extra pair of Git commands. It first runs git fetch, then a second command, either git rebase (if you choose that) or git merge (the default). We've seen above what git merge can do: a real merge, or a fast forward. But what about the git fetch step?
What git fetch is really about is sharing commits with some other Git repository. This means we need to have another Git repository, and put that into our pictures. This other Git repository might be on GitHub or Bitbucket or GitLab or one of those various hosting services, or it might be a work computer, or whatever. But it's a Git repository, and that means that it has commits, and it has its own branches.
Our Git will call up their Git and have them list out their branch names and commit hash IDs. When they list an "interesting" branch name and hash ID, our Git will grab that information. Then our Git will see if we already have that commit, by that hash ID. Remember, hash IDs are unique to each commit, but they have one other key property: every Git everywhere uses the same hash ID for that commit. So either we have the hash ID in our Git, so we have the commit; or, we don't have the hash ID, so we don't have the commit.
If they do have some commits that we don't, we can draw that like this:
our Git repository:
...--G--H <-- master (HEAD)
their Git repository:
...--G--H--I--J <-- master
Our Git will see that we don't have J. Our Git will then ask their Git for J, and they will also offer J's parent I (by hash ID). Our Git will see that we need that one too, and ask for it (by hash ID), and their Git will offer H. Our Git will see that we don't need H and say no thanks, we have that one.
They'll now package up what our Git needs to add I and J to our collection, Borg-fashion. They will send that over and our Git will add it to our repository:
...--G--H <-- master (HEAD)
         \
          I--J <-- ???
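The offer-and-decline conversation can be sketched as a walk backwards from their tip that stops at anything we already have. This is a toy model, not the actual fetch protocol (which is considerably more elaborate); commits_to_fetch is a hypothetical name and the letters stand in for hash IDs.

```python
def commits_to_fetch(their_parents, their_tip, ours):
    """Walk back from their tip, collecting commits until we hit ones we have."""
    wanted, todo = [], [their_tip]
    while todo:
        c = todo.pop()
        if c in ours or c in wanted:
            continue              # "no thanks, we have that one"
        wanted.append(c)          # we need this commit...
        todo.extend(their_parents[c])   # ...so they offer its parents too
    return wanted

# Their repository: ...--G--H--I--J <-- master
their_parents = {"G": [], "H": ["G"], "I": ["H"], "J": ["I"]}
# Our repository: ...--G--H <-- master
ours = {"G", "H"}

needed = commits_to_fetch(their_parents, "J", ours)
print(needed)
```

Comparing by ID alone is enough here because every Git everywhere uses the same hash ID for the same commit.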
But now we need a name, because our Git will only show us commits when it can find them by name. The name our Git will use is a remote-tracking name: we'll take their name master and stick a prefix like origin/ in front of it.[1] So, after the git fetch finishes and exits, the actual picture we should draw is now:
...--G--H <-- master (HEAD)
         \
          I--J <-- origin/master
The pull command will now have our Git run either git merge or git rebase.[2] The default is to use git merge. Git will merge commit J, and as long as the merge is a fast-forward (as it is in this case) we'll get:
...--G--H--I--J <-- master (HEAD), origin/master
as our result.
If no commits come in, so that origin/master and master (HEAD) select the same commit H before and after the git fetch, git pull won't do anything extra. So the pull (or fetch-and-then-second-command) step is only necessary if the other Git has new commits that we want to incorporate.
[1] The origin/ part comes from the name of the remote that you use to talk to the other Git. Technically, these refs are in a different namespace, under refs/remotes/ rather than refs/heads/. Git normally hides this from us, sometimes a little more, sometimes a little less: git branch will sometimes show the name origin/master and sometimes show the name remotes/origin/master. I don't know why it is not consistent here.
[2] The pull command runs this merge-or-rebase on the current branch and only the current branch, regardless of any other names git fetch may have updated. It uses the raw hash IDs directly and sets up a particular merge message when using git merge.
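Putting the two halves of git pull together, here is a toy sketch of "fetch, then try to fast-forward the current branch". The dictionaries, the fetch and pull functions, and the returned strings are all made up for the illustration; the sketch only covers the default merge flavor and stops where a real merge or a rebase would begin.

```python
def ancestors(parents, tip):
    """The tip plus everything reachable through parent links."""
    seen, todo = set(), [tip]
    while todo:
        c = todo.pop()
        if c not in seen:
            seen.add(c)
            todo.extend(parents[c])
    return seen

def fetch(ours, theirs, remote="origin"):
    """Take their commits and remember their branch tips under remote-tracking names."""
    ours["parents"].update(theirs["parents"])   # same ID means same commit
    for name, tip in theirs["branches"].items():
        ours["branches"][f"{remote}/{name}"] = tip   # our own names never move here

def pull(ours, theirs, branch="master", remote="origin"):
    """Fetch, then fast-forward the current branch if that is possible."""
    fetch(ours, theirs, remote)
    current = ours["branches"][branch]
    upstream = ours["branches"][f"{remote}/{branch}"]
    if current == upstream:
        return "already up to date"
    if current in ancestors(ours["parents"], upstream):
        ours["branches"][branch] = upstream
        return "fast-forward"
    return "needs a real merge or a rebase"

ours = {"parents": {"G": [], "H": ["G"]}, "branches": {"master": "H"}}
theirs = {"parents": {"G": [], "H": ["G"], "I": ["H"], "J": ["I"]},
          "branches": {"master": "J"}}

first = pull(ours, theirs)
second = pull(ours, theirs)
print(first, second, ours["branches"])
```

The fetch step only ever touches the origin/ names, which is why it is safe to run at any time; it is the second step that changes your current branch.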
Conclusions
The general idea of rebase is: I have some commits, they're OK as is, but they'd be improved if I moved them. You can't actually move a commit (a commit, once made, is 100% read-only), but you can copy commits to new-and-improved commits with new and different hash IDs.
A fast-forward operation really means move the branch name to point to some already-existing commit that's further down the chain of commits. Drawing the graph will let you see if that's actually possible. When git merge does a fast-forward, it also checks out the commit to which it moved the branch name.
A git merge does a fast-forward if it can, and a real merge if it can't. Adding --ff-only tells it: If you can't do this as a fast-forward, just tell me that and quit.
Using git fetch, you can get someone else's commits, from some other Git repository, into your repository. This step is always safe and can be run at any time, on any branch. But, having obtained those commits, you'll need to use a second Git command to actually incorporate their commits into your branch-names.
Whether to use rebase or merge to incorporate fetched commits is a matter of opinion (as is whether to use rebase at all, ever). But whatever you do to incorporate fetched commits, that part happens on your current branch, because both git rebase and git merge use the current branch.
The git pull command means: Run git fetch, then run a second command to affect the current branch, to incorporate what we fetched. In some cases there's no second command to run, because you didn't pick up anything in the fetch step; and in a rare case (a new repository that had no commits before, hence has no current branch; or right after git checkout --orphan) there's nothing to rebase-or-merge either. (You probably won't hit this rare case, but back in the bad old days of 2005-or-so, git pull could wreck your work-tree, if you had one. Fortunately that's long since fixed.)
There's no single right work-flow, but the one you are using is fine.