Question regarding git rebase vs git merge in master

Question

Here's a simple workflow I used in my feature branch. I have only 1 commit that I would like to merge neatly into master (using fast-forward merging).

Work on feature

git checkout -b feature

work, work, work

git add .

git commit -m "finished feature"

Rebase feature branch on top of updated master

git checkout master

git pull

git checkout feature

git rebase master

Merge feature branch on top of master

git checkout master

Now should I do git rebase feature OR git merge feature? What would be the difference in this case? Which is the best practice?

git push

Please take a look at this SO question: https://stackoverflow.com/questions/16666089/whats-the-difference-between-git-merge-and-git-rebase/25267150 — Anna, Jan 19 '20 at 23:13

score 4 · Accepted Answer · answered Jan 19 '20 at 22:27

If you're using a fast-forward merge, then these two operations are equivalent. git rebase detects when no rebase needs to be done and avoids doing one, and doing a fast-forward merge just updates the head to the new location. The only time it matters which one you choose is when the operation is not a fast forward.

In your case, regardless of which one you do, you'd want to check out master and run git merge --ff-only feature. That will do the fast-forward portion of the operation and fail if it isn't a fast forward.

Doing a git rebase feature would rebase master on top of feature, which wouldn't produce the results you're looking for in this case.
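The whole sequence can be tried in a throwaway repository. The following is a minimal sketch, assuming git is on your PATH. The temporary directory, the file names, and the commit helper are made up for illustration, and a commit made directly on master stands in for whatever git pull would have brought in.

```shell
set -eu
# Throwaway repository; every name below is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit, so nothing ever conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit finished-feature

# Stand-in for "git pull" bringing new work into master.
git checkout -q master
commit upstream

# Rebase feature on top of the updated master.
git checkout -q feature
git rebase -q master

# Back on master: fast-forward to feature, or fail if that is not possible.
git checkout -q master
git merge -q --ff-only feature
git log --oneline --graph --all
```

After this, master and feature name the same commit and the history is a straight line with no merge commit.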

bk2204

Take a look at this video: https://www.youtube.com/watch?v=f1wnYdLEpgI. At 5:37 this guy does a git rebase feature on the master branch and it seems to work fine, though. Thoughts? Do I need --ff-only if my branch is already rebased on the latest master? – no_clue_so Jan 19 '20 at 23:11
It may happen to work, but what you're asking for is a different thing. The `merge --ff-only` is the way to go here. – bk2204 Jan 19 '20 at 23:14

score 3 · Answer 2 · answered Jan 20 '20 at 00:19

"Best practice" is a matter of opinion and therefore off-topic on StackOverflow. More important here is that you understand what these various options do, so that you can pick whichever one suits you the most.

> I have only 1 commit that I would like to merge neatly into master (using fast-forward merging).

In this case (and given your setup steps), git rebase followed by git merge will work more often than just git merge. That's because sometimes—perhaps most times, depending on who else you might be working with here—the git rebase will do nothing. In the cases where git rebase does something, the thing it does will be necessary.

But your example command sequence is a bit different:

git checkout master
git pull
git checkout feature
git rebase master

What's that git pull doing in there? Well, we'll get there, because git pull means run git fetch, then run a second Git command, and that second command itself is either git merge or git rebase. This means you need to understand at least one of git rebase and git merge, depending on which second command you pick. You also need to understand git fetch.

TL;DR

The following is unavoidably long, because Git is a bit complicated, but there is a short version:

If you like, you can use git merge --ff-only and just let it fail in cases where it can't be used. That tells you: stop, step back, take a close look at what you have now, and decide if you want to rebase, or merge, or do some longer sequence of operations. See bk2204's accepted answer that went in while I was writing all this...

(I also suggest that Git newbies avoid git pull, so that they know exactly which commands they're running. Once you're familiar with the pieces, then you can use git pull as a convenience, if you find it convenient. When I was new to Git, back in 2005 or 2006 or so, I found that git pull made it unclear what's really happening. Another reason to avoid git pull is that what the second command it runs does depends on what the fetch does. You have to know in advance what git fetch is going to fetch! Well, that, or not really care, and not-really-caring is actually kind of common.)

A fast recap of things you may already know, but with pictures

Each commit, in Git, has a unique hash ID. That hash ID is in a sense the "true name" of the commit. No other commit, not even one in some other Git repository, can ever use this commit's hash ID for a different commit. No past or future commit can re-use this ID: it's for this commit, whichever commit "this commit" is.
Every commit stores two groups of things: the data part, which holds a snapshot of all of your files (not changes since the previous commit, just a snapshot), and the metadata part, which holds things like your name and email address, date-and-time stamps, log messages, and so on. One of the metadata items of every commit is a list of hash IDs of previous or parent commits. Most commits have exactly one previous / parent commit.
A branch name like master holds the hash ID of one commit. That commit is to be considered the last commit in the branch. That's all it has in it, just the one hash ID. It's the commits themselves that remember who comes "before" them.

Whenever something holds a commit hash ID, we say that this thing points to that commit. So we can draw all of this like so:

... <-F <-G <-H   <-- master
               \
                I <-J   <-- feature

The name master here points to commit H, so that H is the last commit in that branch. The name feature points to commit J, so that J is the last commit in that branch. Commit I is in feature, because J points back to I. Commit H is in the feature branch too, though, because I points back to H.
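You can look at these pointers directly. Here is a small sketch, assuming git is on your PATH; the repository is a throwaway one, the commit helper is hypothetical, and only commits H, I and J from the drawing are created.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
git checkout -q -b feature
commit I
commit J

# A branch name resolves to exactly one commit hash ID.
git rev-parse master feature

# The raw commit object: tree (the snapshot), parent, author, committer, message.
git cat-file -p feature

# Following the parent links backwards from feature: J, then I, then H.
git log --format='%h %s (parent: %p)' feature
```

The last command stops at H because H has no parent in this tiny repository; in the drawing it would continue back through G and F.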

In other words, commits can be on more than one branch at a time. If we create a second branch from master and add a commit there, we get:

             I--J   <-- feature
            /
...--F--G--H   <-- master
            \
             K   <-- feature2

(I always get lazy and stop drawing in the backwards-pointing arrows between commits, because it's too annoying. Just remember that the "lines" connecting commits are really backwards-pointing arrows from the later commits to the earlier ones.)

In this case, commits up through H are on all three branches, while the I-J set are only on feature and K is only on feature2.
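Git can answer the "which branches contain this commit" question itself. A sketch under the same assumptions as before (git on PATH, throwaway repository, hypothetical commit helper), building the three-branch picture and asking with git branch --contains:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
git branch feature
git branch feature2
git checkout -q feature
commit I
commit J
git checkout -q feature2
commit K

# H (the tip of master) is on all three branches.
git branch --contains master

# I (the parent of feature's tip) is only on feature.
git branch --contains 'feature~1'
```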

To make a new commit, we git checkout whichever branch name we want the new commit to be "on". That selects the commit and attaches the name HEAD to the branch name. So, if we run git checkout master, we get:

             I--J   <-- feature
            /
...--F--G--H   <-- master (HEAD)
            \
             K   <-- feature2

We now have commit H checked out. If we run git checkout feature, we get:

             I--J   <-- feature (HEAD)
            /
...--F--G--H   <-- master
            \
             K   <-- feature2

and we now have commit J checked out. Let's git checkout feature2 to select commit K:

             I--J   <-- feature
            /
...--F--G--H   <-- master
            \
             K   <-- feature2 (HEAD)

Now let's make a new commit L, in the usual way: edit some files, git add, and run git commit. Git will make a new commit from whatever is in the index or staging area (two different terms for the same thing). The snapshot will freeze all these files forever (or for as long as commit L continues to exist). Git will collect our log message, add our name as author and committer, set the parent of new commit L to existing commit K, and last, write L's new hash ID (whatever it is) into the name feature2 so that we have:

             I--J   <-- feature
            /
...--F--G--H   <-- master
            \
             K--L   <-- feature2 (HEAD)

The current commit is now our new commit L, and the branch names point to the commits as we've drawn them. (Of course their real hash IDs are big ugly strings that look random, that we'll never remember, and that we would have to cut and paste to get right. Hence the use of simple letters here.)

Merge is about combining work (except when it's a fast-forward)

Let's draw things this way for a moment:

          I--J   <-- branch1 (HEAD)
         /
...--G--H
         \
          K--L   <-- branch2

You can see from this diagram that we just have two interesting branch names, and we're on branch1 which is commit J. If we now run git merge branch2, Git will be forced to do a full merge.

The full merge process starts by finding the merge base commit, which in this case is H. It then compares the snapshot in H to the snapshot in J, where we are now, to see what we changed. These are the ours changes; J is the ours commit. Next, Git compares the snapshot in H to the snapshot in L, to see what they changed. These are the theirs changes with L being the theirs commit. The merge process will combine the changes, apply the combined changes to the merge base H, and, if all goes well, make a new merge commit M:

          I--J
         /    \
...--G--H      M   <-- branch1 (HEAD)
         \    /
          K--L   <-- branch2

New commit M will have, as its snapshot, whatever was in H, modified by adding both our changes (H-vs-J) and their changes (H-vs-L). So we keep our changes and add their changes: that's what merging is about, after all. To remember what got merged, new commit M will have two parents, instead of just one. The first parent will be the commit we had checked out a moment ago, which was commit J. The second parent will be the other commit, the one we merged: commit L.
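Here is a sketch of that full merge in a throwaway repository, assuming git is on your PATH. The commit helper and the one-file-per-commit layout are made up so that the two sides never conflict, and the name master is used only to give H a label.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
git branch branch2
git checkout -q -b branch1
commit I
commit J
git checkout -q branch2
commit K
commit L

git checkout -q branch1
ours=$(git rev-parse branch1)     # commit J
theirs=$(git rev-parse branch2)   # commit L

# The merge base is H, the best commit that is on both branches.
git merge-base branch1 branch2

# A real merge: --no-edit accepts the default merge message.
git merge -q --no-edit branch2

# The new merge commit M has two parents: first J, then L.
git rev-parse 'HEAD^1' 'HEAD^2'
git log --oneline --graph
```

The work tree now holds the files from both sides (J.txt and L.txt here), which is the "keep our changes and add their changes" result.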

Fast-forward merges have no work to combine

But what if the input picture doesn't look like this? Suppose, instead, we have the simpler picture:

...--G--H   <-- master (HEAD)
         \
          I--J  <-- feature

We are on H now, via the name master to which HEAD is attached. We run git merge feature or git merge hash-of-J. Git finds the best common commit—the best commit that's on both branches—but that's commit H, which is the one we're on! If Git did a full blown merge, it would compare H vs H to see what we changed. That would of course be nothing at all. Then it would compare H vs J to see what "they" (really, we, on feature) changed. Then it would add those changes to H. The result would always exactly match commit J.

If Git did a full merge, we'd get:

...--G--H------K   <-- master (HEAD)
         \    /
          I--J  <-- feature

where the snapshot in K matches that in J. The difference between K and J is that K is a different commit, with (1) a different hash ID and (2) different parents: both H and J. Well, probably (3) K would have different date-and-time stamps, too.

We can, if we want, make Git do this—but Git's default is to not do this. Instead of making a new commit K and writing that hash ID into master, Git can just check out existing commit J and put that hash ID into master:

...--G--H
         \
          I--J   <-- master (HEAD), feature

You get the same snapshot, but there's no extra commit. The two names, master and feature, now identify the same commit, J, and there's no reason to draw the graph with a kink in it:

...--G--H--I--J   <-- master (HEAD), feature

and we can safely delete the name feature if we want.
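To see both outcomes side by side, here is a sketch assuming git is on your PATH. The branch names with-ff and without-ff are invented for the comparison, and --no-ff is the git merge option that asks for the extra merge commit even when a fast-forward is possible.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
git checkout -q -b feature
commit I
commit J

# Two extra names pointing at H, so each kind of merge starts from the same place.
git branch with-ff master
git branch without-ff master

# Default behaviour: the name simply moves to J, no new commit.
git checkout -q with-ff
git merge -q feature

# Forced real merge: a new commit with parents H and J.
git checkout -q without-ff
git merge -q --no-ff --no-edit feature

git log --oneline --graph --all
```

Both names end up with the same snapshot as feature; only without-ff has a commit that feature does not.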

We can tell Git: do a merge, but only if you can do it as a fast-forward. To do this we use:

git merge --ff-only <branch-or-commit-hash>

This tests whether the merge can be just a fast-forward operation. If so, Git doesn't really merge, and just does the fast-forward. If not, Git won't do a full merge. (If we left out the --ff-only, Git would try a full merge.)
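A sketch of the refusing case, assuming git is on your PATH and using a throwaway repository with a hypothetical commit helper. Here master gains a commit of its own, so the two histories diverge and --ff-only has to give up:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
git checkout -q -b feature
commit I
git checkout -q master
commit K          # master has moved on, so feature is no longer "ahead only"

before=$(git rev-parse master)

# --ff-only exits non-zero instead of starting a real merge.
if git merge --ff-only feature 2>/dev/null; then
  echo "fast-forwarded"
else
  echo "not a fast-forward; master left untouched"
fi
```

Because the command exits with a non-zero status, it is also easy to use in scripts as a "stop and look" check.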

Rebase is really about copying commits

Suppose we have the following graph:

          I---J   <-- feature
         /
...--G--H--K--L   <-- master

(We haven't picked a branch to git checkout yet, so there's no (HEAD) notation.) If we want to combine these right now, as is, Git would be forced to make a real merge, no matter whether we git checkout master; git merge feature or git checkout feature; git merge master. Either way, Git needs to find the merge base H, do two diffs, combine changes, and make a merge commit.
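If you would rather ask Git than read the drawing, git merge-base reports the merge base, and its --is-ancestor mode tells you whether one name could be fast-forwarded to the other. A sketch, assuming git is on your PATH, with a throwaway repository and a hypothetical commit helper:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
h=$(git rev-parse HEAD)
git checkout -q -b feature
commit I
commit J
git checkout -q master
commit K
commit L

# Prints the hash ID of H.
git merge-base master feature

# A fast-forward of master to feature needs master's tip to be an ancestor of feature's tip.
if git merge-base --is-ancestor master feature; then
  echo "master can be fast-forwarded to feature"
else
  echo "a real merge (or a rebase first) is needed"
fi
```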

If we don't want a merge commit, though, we can rebase commits I and J by copying them to new commits I' and J'. We'll run:

git checkout feature
git rebase master

Git will list out the commit hash IDs to be copied—which are obvious to us; it's I and J—and then will start the rebase by detaching HEAD from feature, so that it points directly to commit L:

          I--J   <-- feature
         /
...--G--H--K--L   <-- HEAD, master

This is Git's detached HEAD mode, which rebase uses pretty heavily. Now Git must copy commit I to a new commit. It should compare I to its parent H, to see what changed. Then it should apply these changes to commit L, and make a new commit. We could call the commit M, but since it's a copy of I, we'll call it I' instead. The name HEAD will automatically update to point to the new commit:

          I--J   <-- feature
         /
...--G--H--K--L   <-- master
               \
                I'  <-- HEAD

The snapshot in I' is the result of combining H-vs-I with H-vs-L. That is, this operation, which Git calls a cherry-pick, actually uses the same merge process that git merge uses! But the final commit, I', is a regular non-merge commit, with one parent.

In any case, having copied I to I', Git must now copy J to a new commit J', in the same way: Git will compare I vs J to see what "they" (we) changed, and compare I vs I' to see what we changed here, and combine these changes. That has the effect of adding I-vs-J to our copy I' so that we have:

          I--J   <-- feature
         /
...--G--H--K--L   <-- master
               \
                I'-J'  <-- HEAD

Don't worry if this seems complicated. It is complicated! The end result is pretty clear though: we have new commits I' and J' that are "just as good as" the originals, except that they're better because the parent of I' is L. So the new chain of two commits is like the old chain, except that:

it starts from a different snapshot (L) and therefore ends with a different snapshot (J'), and
it comes right after L.

Now that we're done copying commits by cherry-picking, rebase does its last step, which is to move the name feature to point to the last copied commit, and re-attach our HEAD:

          I--J   [abandoned]
         /
...--G--H--K--L   <-- master
               \
                I'-J'  <-- feature (HEAD)

The original commits I-J are still in the repository, but we can't find them any more, because we always start by looking at the names—feature or master—and working backwards. (Eventually—after about 30 days in a normal setup—if no one can find I and J, and you haven't deliberately resurrected them to undo your rebase, Git will sweep them away for real, and those snapshots will be gone.)
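Until that cleanup happens, the originals are still reachable by hash ID. The sketch below, assuming git is on your PATH and using a throwaway repository with a hypothetical commit helper, runs the rebase and then looks at both the copies and the originals. ORIG_HEAD is the name that git rebase leaves pointing at the branch tip from before the rebase.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
git checkout -q -b feature
commit I
commit J
git checkout -q master
commit K
commit L

git checkout -q feature
old_tip=$(git rev-parse feature)   # the original J

git rebase -q master

# feature now names J', a different hash ID, sitting two commits after L.
git rev-parse feature
git log --oneline --graph --all

# The original I-J chain can still be found through ORIG_HEAD.
git log --oneline -2 ORIG_HEAD
```

Running git reset --hard ORIG_HEAD right after the rebase would be one way to put feature back where it was, which is the "deliberately resurrected" case mentioned above.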

Doing this kind of rebase makes it possible to fast-forward

What we had before the rebase would have required a real merge. What we have after the rebase is now fast-forward-able. Now that we have:

...--G--H--K--L   <-- master
               \
                I'-J'  <-- feature (HEAD)

we can use git checkout master followed by git merge --ff-only feature and get:

...--G--H--K--L--I'-J'  <-- master (HEAD), feature

just like before.

Sometimes rebase is unneeded

If we start with:

...--G--H  <-- master
         \
          I   <-- feature (HEAD)

and run git rebase master, Git:

Lists out the commits that are on feature but not master: I.
Checks out master as a detached HEAD.
Copies commit I to come where commit I is: this doesn't require any copying and Git just says to itself, ah, let's re-use I in place.
Is finished copying, so moves the name feature to point here, which is where it already points, and re-attaches HEAD.

The result is a blur of motion—listing and detaching and not really doing anything and then reattaching—resulting in no change at all. We're now ready to git merge --ff-only.
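A quick way to convince yourself that nothing changed is to compare the hash ID before and after. A sketch, assuming git is on your PATH, with a throwaway repository and a hypothetical commit helper:

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
git checkout -q -b feature
commit I

before=$(git rev-parse feature)
git rebase master        # feature already sits directly on top of master
after=$(git rev-parse feature)

if [ "$before" = "$after" ]; then
  echo "rebase did nothing: feature is still $after"
fi
```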

What about `git fetch`?

Your git pull sequence introduced an extra pair of Git commands. It first runs git fetch, then a second command, either git rebase (if you choose that) or git merge (the default). We've seen above what git merge can do: a real merge, or a fast forward. But what about the git fetch step?

What git fetch is really about is sharing commits with some other Git repository. This means we need to have another Git repository, and put that into our pictures. This other Git repository might be on GitHub or Bitbucket or GitLab or one of those various hosting services, or it might be a work computer, or whatever. But it's a Git repository, and that means that it has commits, and it has its own branches.

Our Git will call up their Git and have them list out their branch names and commit hash IDs. When they list an "interesting" branch name and hash ID, our Git will grab that information. Then our Git will see if we already have that commit, by that hash ID. Remember, hash IDs are unique to each commit, but they have one other key property: every Git everywhere uses the same hash ID for that commit. So either we have the hash ID in our Git, so we have the commit; or, we don't have the hash ID, so we don't have the commit.

If they do have some commits that we don't, we can draw that like this:

our Git repository:
...--G--H   <-- master (HEAD)

their Git repository:
...--G--H--I--J   <-- master

Our Git will see that we don't have J. Our Git will then ask their Git for J, and they will also offer J's parent I (by hash ID). Our Git will see that we need that one too, and ask for it (by hash ID), and their Git will offer H. Our Git will see that we don't need H and say no thanks, we have that one.

They'll now package up what our Git needs to add I and J to our collection, Borg-fashion. They will send that over and our Git will add it to our repository:

...--G--H   <-- master (HEAD)
         \
          I--J   <-- ???

But now we need a name, because our Git will only show us commits when it can find them by name. The name our Git will use is a remote-tracking name: we'll take their name master and stick a prefix like origin/ in front of it.¹ So, after the git fetch finishes and exits, the actual picture we should draw is now:

...--G--H   <-- master (HEAD)
         \
          I--J   <-- origin/master

The pull command will now have our Git run either git merge or git rebase.² The default is to use git merge. Git will merge commit J, and as long as the merge is a fast-forward—as it is in this case—we'll get:

...--G--H--I--J   <-- master (HEAD), origin/master

as our result.

If no commits come in, so that origin/master and master (HEAD) select the same commit H before and after the git fetch, git pull won't do anything extra. So the pull (or fetch-and-then-second-command) step is only necessary if the other Git has new commits that we want to incorporate.
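The two halves of git pull can also be run by hand, which makes it easier to see what each one does. This is only a sketch, assuming git is on your PATH: "theirs" and "ours" are two throwaway repositories on the local disk standing in for the hosted repository and your clone, and the commit helper is hypothetical.

```shell
set -eu
work=$(mktemp -d)
theirs="$work/theirs"
ours="$work/ours"

git init -q "$theirs"
cd "$theirs"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one new file per commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit H
git clone -q "$theirs" "$ours"   # our Git starts out with H only

# Their repository gains I and J after we cloned.
commit I
commit J

cd "$ours"
before=$(git rev-parse master)

# Step 1: fetch. Only origin/master moves; our master stays at H.
git fetch -q origin
fetched=$(git rev-parse origin/master)
git rev-parse master origin/master

# Step 2: incorporate what was fetched, as a fast-forward or not at all.
git merge -q --ff-only origin/master
git log --oneline --graph --all
```

Splitting the steps like this means you get to look at origin/master before deciding whether to merge, rebase, or do nothing.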

¹The origin/ part comes from the name of the remote that you use to talk to the other Git. Technically, these refs are in a different namespace: under refs/remotes/ rather than refs/heads/. Git normally hides this from us, sometimes a little more, sometimes a little less: git branch will sometimes show the name origin/master and sometimes show the name remotes/origin/master. I don't know why it is not consistent here.

²The pull command runs this merge-or-rebase on the current branch and only the current branch, regardless of any other names git fetch may have updated. It uses the raw hash IDs directly and sets up a particular merge message when using git merge.

Conclusions

The general idea of rebase is: I have some commits, they're OK as is, but they'd be improved if I moved them. You can't actually move a commit (a commit, once made, is 100% read-only), but you can copy commits to new-and-improved commits with new and different hash IDs.
A fast-forward operation really means move the branch name to point to some already-existing commit that's further down the chain of commits. Drawing the graph will let you see if that's actually possible. When git merge does a fast-forward, it also checks out the commit to which it moved the branch name.
A git merge does a fast-forward if it can, and a real merge if it can't. Adding --ff-only tells it: If you can't do this as a fast-forward, just tell me that and quit.
Using git fetch, you can get someone else's commits, from some other Git repository, into your repository. This step is always safe and can be run at any time, on any branch. But, having obtained those commits, you'll need to use a second Git command to actually incorporate their commits into your branch-names.
Whether to use rebase or merge to incorporate fetched commits is a matter of opinion (as is whether to use rebase at all, ever). But whatever you do to incorporate fetched commits, that part happens on your current branch, because both git rebase and git merge use the current branch.
The git pull command means: Run git fetch, then run a second command to affect the current branch, to incorporate what we fetched. In some cases there's no second command to run, because you didn't pick up anything in the fetch step; and in a rare case—a new repository that had no commits before, hence has no current branch; or right after git checkout --orphan—there's nothing to rebase-or-merge either. (You probably won't hit this rare case, but back in the bad old days of 2005-or-so, git pull could wreck your work-tree, if you had one. Fortunately that's long since fixed.)

There's no single right workflow, but the one you are using is fine.