Why do I need a merge after a rebase?

Question

As you can see here (https://stackoverflow.com/a/9147389/16430630), I often see a merge performed after a rebase. If you look at the images below, it looks like the rebase command is already doing a merge. Why is it necessary to merge twice (rebase + merge)?

git checkout feature
git rebase master 
git checkout master
git merge feature

What happens if I leave out the last command, git merge feature?

You would still have `feature-a` branch. Now it would be placed on top of master, but master doesn't have those changes, that's what the merge does. — Lasse V. Karlsen, Jul 25 '21 at 12:15
You don't. It is recommended to rebase a branch on top of the target branch often while you work on it in order to catch early any possible merge conflict that might happen when you want to merge. But _merge_ and _rebase_ are different operations with different purposes. — axiac, Jul 25 '21 at 12:15

score 4 · Answer 1 · answered Jul 25 '21 at 12:41

This is a variant on a reverse merge, where you merge master into feature before merging feature into master. The idea is to move the merge base up so as to reduce the likelihood of conflicts on the second merge. Using rebase instead, we also change the history, making it appear that feature diverged from master more recently than it originally did.

What happens if I leave out the last command git merge feature-a

Then you might as well leave out everything. The last command was the goal of the whole operation.

In other words, you are asking the wrong question. The question is not why you need a merge after a rebase; it is why you need a rebase before a merge. The answer is: you don't, but it makes things nicer.
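To see which name each command moves, here is a minimal sketch in a throwaway repository. The file names and commit messages are made up for illustration; only the four commands from the question are taken from the source.

```shell
#!/bin/sh
# Hypothetical scratch repository; file names and messages are illustrative only.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the question
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"
git checkout -q -b feature
echo feature > feature.txt && git add feature.txt && git commit -q -m "feature work"
git checkout -q master
echo master > master.txt && git add master.txt && git commit -q -m "master work"

master_before=$(git rev-parse master)
feature_before=$(git rev-parse feature)

# Step 1: rebase. Only the name "feature" moves (to new, copied commits).
git checkout -q feature
git rebase -q master
master_after_rebase=$(git rev-parse master)
feature_after_rebase=$(git rev-parse feature)
merge_base_after=$(git merge-base master feature)   # the merge base has moved up to master's tip

# Step 2: merge. Only the name "master" moves.
git checkout -q master
git merge -q feature
master_after_merge=$(git rev-parse master)

echo "master  before rebase: $master_before"
echo "master  after rebase:  $master_after_rebase"
echo "feature before rebase: $feature_before"
echo "feature after rebase:  $feature_after_rebase"
echo "master  after merge:   $master_after_merge"
```

If you stop after the rebase, master still names the same commit as before; it is the final merge that makes master include the feature work.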

So the rebase command doesn't do an instant merge? That's why I need the merge at the end? — software, Jul 25 '21 at 13:14
Rebasing feature onto master changes feature but has no effect whatever on master. Merging feature into master changes master and has no effect on feature. — matt, Jul 25 '21 at 13:29

score 3 · Answer 2 · answered Jul 25 '21 at 12:09

The rebase command introduces the changes from master into your feature branch. You could also use a merge here.

The merge command introduces the changes from the feature branch into master. You should not use rebase here, because it would rewrite commits already on master that other people have based their branches on.

If you leave out the merge, master won't have the changes from feature.

torek · Accepted Answer · 2021-07-28T06:49:11.060

Besides matt's answer, it's worth considering what git rebase is about in the first place. Why do we ever use git rebase? There is only one real¹ answer:

We have some existing commit(s).
We don't like something about these existing commits.
So, we'd like to make new-and-improved replacement commits, that fix whatever it is we don't like about the existing commits.

Because git rebase is essentially git cherry-pick on steroids, and using git cherry-pick allows us to alter commits during the copying process, we gain two abilities when we rebase:

First, the new commits often exist at a different "place" in history (in the Git commit graph). We may actually have to work harder to make them appear at the "same place": the default rebase action is to place the new, copied commits after the commit we select as the upstream or --onto argument, depending on how we run git rebase. That's very often the latest commit on the target branch, which is very often not where things were when we started.
Second, using git rebase -i and its edit mode, or when Git rebase pauses with a merge conflict, we can or even must change the effect of the in-progress copy we had going at the time the rebase action paused. We make this change and resume the rebase, and the "copied" commit no longer does exactly the same thing as the original commit, and/or has a different commit message. (It's possible that the commit message alone is the one thing we want to change, in which case we can use the reword action of git rebase -i.)

In the end, though, the result of a rebase is a set of new-and-improved replacement commits, that fix something we didn't like about the original commits. (If the result of the rebase doesn't fix anything—or makes things worse—then we should undo the rebase, which is easy right after rebase finishes.²)

Once we're done with the rebase, it may be the case that a true merge is no longer necessary, even though such a merge was necessary before we started. This happens when one of the changes we make is to put the commits into a new "place in history". The git merge command you run, at the end, may do what Git calls a fast-forward merge instead of a true merge. A fast-forward merge is not a merge at all (and I wish Git did not call it that).

Some people like this fast-forward effect. Some people don't; if you're in the "don't like it" group, you can use git merge --no-ff to force Git to make a true merge, even if a fast-forward non-merge is possible. That's all a matter of personal and/or group preferences. If you or your group likes the fast-forward effect, though, the rebase itself may be required in order to enable the fast-forward effect. This may be why you are doing rebases.
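As a rough illustration of the difference, the sketch below merges the same branch twice from the same starting point, once allowing a fast-forward and once with --no-ff. The branch names topic, ff-demo and noff-demo are hypothetical and exist only for this demonstration.

```shell
#!/bin/sh
# Hypothetical scratch repository; all names here are illustrative only.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"
git checkout -q -b topic
echo topic > topic.txt && git add topic.txt && git commit -q -m "topic work"
topic_tip=$(git rev-parse topic)

# Fast-forward: the branch name simply slides forward; no new commit is made.
git checkout -q -b ff-demo main
git merge -q --ff-only topic
ff_tip=$(git rev-parse ff-demo)

# --no-ff: Git makes a true merge commit even though a fast-forward was possible.
git checkout -q -b noff-demo main
git merge -q --no-ff -m "Merge branch 'topic'" topic
set -- $(git rev-list --parents -n 1 noff-demo)   # commit hash followed by its parents
noff_parents=$(($# - 1))

echo "ff-demo tip equals topic tip: $ff_tip"
echo "noff-demo tip has $noff_parents parents"
```

Here --ff-only is used only to make the first case explicit; a plain git merge would also fast-forward in this situation.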

¹The unreal answers are: "because my boss told me to", "because that's what we've always done", and other kinds of cargo-cult programming. Those are answers, they're just not what I would consider good ones.

²Right after the rebase has finished, we can run git reset --hard ORIG_HEAD and all is as if we never started the rebase in the first place. Or, if the rebase is going very badly in the middle, we can use git rebase --abort, which has the same effect: it is as if we never started a rebase at all. There is a lot going on behind the scenes here, and files that aren't in Git may not be restored by this process, of course.
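A minimal sketch of the undo described in footnote 2, assuming nothing else has overwritten ORIG_HEAD since the rebase finished. The repository, branch and file names are made up.

```shell
#!/bin/sh
# Hypothetical scratch repository; names are illustrative only.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"
git checkout -q -b feature
echo feature > feature.txt && git add feature.txt && git commit -q -m "feature work"
git checkout -q main
echo main > main.txt && git add main.txt && git commit -q -m "main work"

git checkout -q feature
before=$(git rev-parse HEAD)

git rebase -q main                 # feature now names new, copied commits
rebased=$(git rev-parse HEAD)

git reset -q --hard ORIG_HEAD      # put feature back on the original commits
restored=$(git rev-parse HEAD)

echo "before:   $before"
echo "rebased:  $rebased"
echo "restored: $restored"
```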

Update per question update

Let me draw, as text, the same pictures you drew. Well, almost the same.

We start with this:

...--G--H   <-- somebranch (HEAD), origin/somebranch

in your own repository. You now add two new commits—we'll call these K and L, skipping over I and J to reserve them:

...--G--H   <-- origin/somebranch
         \
          K--L   <-- somebranch (HEAD)

At this point, you're ready to combine your work with any work someone else may have done, so you run git fetch origin. This brings into your repository two new commits that someone else wrote. Let's draw them in now:

          I--J   <-- origin/somebranch
         /
...--G--H
         \
          K--L   <-- somebranch (HEAD)

This is analogous to your "before rebasing" picture, except that I'm using the same name on both sides (somebranch), rather than the name main. Your Git renames their branch names to make your remote-tracking names; that's why you have an origin/somebranch at all, and that's where we picked up the two new commits.

In order to combine our work (K-L) with their work (I-J), we must make a choice: rebase, merge, or some combination of both. (Not many people do the last one as it is a fair bit of extra work, and it's usually important to get a barely-adequately-working answer as fast as possible, rather than a beautifully-crafted-internally answer next week. Nobody looks at the innards of software, the buyers just pay for getting it done yesterday!)

A regular merge is the simplest answer. Git will:

compare the snapshot in commit H to that in commit J to see what they changed;
compare the snapshot in commit H to that in commit L to see what we changed;
do its best to combine these two sets of changes and apply the combined result to the snapshot in H.

This keeps our changes and adds theirs, or—equivalently—keeps their changes and adds ours. The resulting merge commit M looks like this:

          I--J   <-- origin/somebranch
         /    \
...--G--H      M   <-- somebranch (HEAD)
         \    /
          K--L

Commit M comes after commit J, so none of their work is lost, and it also comes after commit L, so our work is kept too. We can therefore git push commit M to the shared repository over on origin. (Commit M will bring commits K-L along with it, automatically.)
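The regular merge can be sketched in a scratch repository as follows. To keep it self-contained there is no real remote: a local branch named theirs stands in for origin/somebranch, and the helper mk and the one-file-per-commit layout are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical scratch repository. "theirs" stands in for origin/somebranch.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example"
git config user.email "example@example.com"

# mk NAME: make one commit that adds NAME.txt, with NAME as its message.
mk() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

mk H; H=$(git rev-parse HEAD)
git branch theirs
mk K; mk L; L=$(git rev-parse HEAD)          # our work
git checkout -q theirs
mk I; mk J; J=$(git rev-parse HEAD)          # their work
git checkout -q somebranch

base=$(git merge-base somebranch theirs)                 # commit H
their_changes=$(git diff --name-only "$base" theirs)     # H vs J
our_changes=$(git diff --name-only "$base" somebranch)   # H vs L

git merge -q -m "M" theirs                   # combine both sets of changes
M=$(git rev-parse HEAD)

echo "merge base:  $base"
echo "they changed:" $their_changes
echo "we changed:  " $our_changes
echo "M's parents: $(git rev-parse "$M^1") $(git rev-parse "$M^2")"
```

The merge commit M records both L and J as parents, which is what lets a later push carry K-L along without discarding I-J.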

For those who dislike merges, though, we can instead copy our commit K to a new-and-improved commit K'. We start by making a temporary branch name:

          I--J   <-- temp (HEAD), origin/somebranch
         /
...--G--H
         \
          K--L   <-- somebranch

We then have our Git copy the effect of commit K—by comparing H-vs-K to see what we changed—and apply that to commit J, where we are now. If all goes well, Git will take the merged results—this is a merge, just like git merge—and make a new non-merge, ordinary commit, whose effect on J is like K's effect on H. We'll call this new commit K', and re-use K's commit message:

               K'  <-- temp (HEAD)
              /
          I--J   <-- origin/somebranch
         /
...--G--H
         \
          K--L   <-- somebranch

We now need Git to do the same thing with commit L. This time, the internal merge that git cherry-pick uses will compare commit K vs L to figure out what we changed, and K-vs-K' to figure out what "they" changed. (That's really what they-and-we changed: K' is I+J+K, after all, with respect to where we all started at H. But Git still refers to this as --theirs.)

The result of this cherry-pick merge is, if all goes well, a new commit L':

               K'-L'  <-- temp (HEAD)
              /
          I--J   <-- origin/somebranch
         /
...--G--H
         \
          K--L   <-- somebranch

The last part of a git rebase consists of "peeling the branch name off" the old commit chain, K-L in this case, and pasting it on the end of the new chain:

               K'-L'  <-- somebranch (HEAD)
              /
          I--J   <-- origin/somebranch
         /
...--G--H
         \
          K--L   [abandoned]

If we strip out the [abandoned] section, this is more or less the same as what you drew as your "after" picture.

This means you have the right picture.
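The steps above can be reproduced by hand with git cherry-pick. This is only a sketch of the idea, not what git rebase literally runs internally; the local branch theirs stands in for origin/somebranch, and the helper mk is hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch repository. "theirs" stands in for origin/somebranch.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/somebranch
git config user.name "Example"
git config user.email "example@example.com"

# mk NAME: make one commit that adds NAME.txt, with NAME as its message.
mk() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

mk H
git branch theirs
mk K; K=$(git rev-parse HEAD)
mk L; L=$(git rev-parse HEAD)
git checkout -q theirs
mk I; mk J; J=$(git rev-parse HEAD)

# 1. Temporary branch name at J.
git checkout -q -b temp theirs
# 2. Copy K, then L, producing K' and L' on top of J.
git cherry-pick "$K" "$L" >/dev/null
# 3. Peel the name somebranch off the old chain and paste it onto the new one.
git branch -f somebranch temp
git checkout -q somebranch
git branch -q -d temp

echo "somebranch now: $(git log --format=%s -4 somebranch | tr '\n' ' ')"
echo "original L still exists as a: $(git cat-file -t "$L")"
```

The original K-L are not deleted by any of this; they merely stop being reachable from the name somebranch.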

Now, let's suppose that the goal of all of this work was to be able to make the name main move forward. The name main currently points to commit J. That is, instead of origin/somebranch in our picture, we should have this:

               K'-L'  <-- somebranch (HEAD)
              /
          I--J   <-- main
         /
...--G--H

The rebase has done its job: it copied K-L, which came off H, to new and improved commits that now come after J. But the name main has not moved.

Doing a:

git checkout main
git merge somebranch

tells Git to figure out whether a true merge is required, or not. A true merge is required when there's an actual set of branching commits, as there were with our "regular merge" example when we made commit M. It's optional in this other case, where we could just "slide the name main forward" (and up because of the kink in the drawing, which is only there because this is limited ASCII art).

The default action for a merge command where a fast-forward operation is possible is to do the fast-forward instead of the merge. The result is:

               K'-L'  <-- main (HEAD), somebranch
              /
          I--J
         /
...--G--H

Note that HEAD is attached to main now, because we ran git checkout main before we ran git merge somebranch.
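A small sketch of this last step, assuming the rebase has already happened so that somebranch sits directly ahead of main. The helper mk is hypothetical, and git merge-base --is-ancestor is used here only as one way to ask whether a fast-forward is possible.

```shell
#!/bin/sh
# Hypothetical scratch repository; K and L here play the role of K' and L'.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

# mk NAME: make one commit that adds NAME.txt, with NAME as its message.
mk() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

mk H; mk I; mk J
git checkout -q -b somebranch
mk K; mk L

git checkout -q main
# A fast-forward is possible when main's tip is an ancestor of somebranch's tip.
if git merge-base --is-ancestor main somebranch; then
  ff_possible=yes
else
  ff_possible=no
fi

git merge -q somebranch            # slides the name main forward
head_branch=$(git symbolic-ref --short HEAD)

echo "fast-forward possible: $ff_possible"
echo "HEAD is attached to:   $head_branch"
echo "main:       $(git rev-parse main)"
echo "somebranch: $(git rev-parse somebranch)"
```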

We don't have to do this in any sort of absolute sense, because Git does not care about branch names. We could just start using the name somebranch as the main branch now, and even just delete the name main entirely:

               K'-L'  <-- somebranch (HEAD)
              /
          I--J
         /
...--G--H

But if we, as mere humans, can't hack that—and we probably can't—we should git checkout main and have Git slide the name forward, and then maybe delete the name somebranch instead:

               K'-L'  <-- main (HEAD)
              /
          I--J
         /
...--G--H

and stop drawing the kinks too, and drop the prime marks from the copied K-and-L commits:

...--G--H--I--J--K--L   <-- main (HEAD)

It now looks like we made commits K and L after, and while seeing, commits I-J. In fact, we made the originals—not called K and L, and abandoned some time ago—before we had access to I-J. But we're so sure that we don't need those originals any more that we're willing to make it impossible, or at least painful, to find them ever again.
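For a while after the rebase, the abandoned originals can usually still be found through the branch's reflog. The sketch below assumes reflogs are enabled (the default in an ordinary, non-bare repository) and that the entries have not yet expired; the repository and the helper mk are made up.

```shell
#!/bin/sh
# Hypothetical scratch repository; names are illustrative only.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.com"

# mk NAME: make one commit that adds NAME.txt, with NAME as its message.
mk() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

mk H
git checkout -q -b somebranch
mk K; mk L
git checkout -q main
mk I; mk J

git checkout -q somebranch
old_tip=$(git rev-parse HEAD)      # the original L
git rebase -q main
new_tip=$(git rev-parse HEAD)      # the copy, L'

# somebranch@{1} is where the name somebranch pointed before its last update.
previous=$(git rev-parse "somebranch@{1}")

echo "original L:      $old_tip"
echo "somebranch@{1}:  $previous"
echo "rebased L':      $new_tip"
```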

I still haven't fully understood the rebase command. For a better understanding, I edited the question and added pictures. So, I still don't know why to use a merge. — software, Jul 28 '21 at 05:47
The rebase command is complicated (perhaps unnecessarily so: I'm not a huge fan of certain of its wrinkles). The decision of when and whether to rebase is at least partly opinion, so there's no right answer. I'll add a quick text drawing relating to your own drawings, though. — torek, Jul 28 '21 at 06:23
Ok, the merge in the end is done because main should point to the last commit and to keep the name "main"? — software, Jul 28 '21 at 09:43
@software: more or less, yes. There's umpteen ways to skin this cat (apologies to my cats); letting `git merge` do a fast-forward is many people's favorite. — torek, Jul 28 '21 at 10:00
This is really helpful. The bit I don't get is that after I have done `git rebase origin/master` I think I end up approx. here: `The last part of a git rebase consists of "peeling the branch name off" the old commit chain, K-L in this case, and pasting it on the end of the new chain:` but `git status ` says `use "git pull" to merge the remote branch into yours`. So it requires a merge to my own branch, before merging that to master, and it throws up the same conflict as I just solved during the rebase. (Needless to say I worked for years with everybody being happy with squash+merge.) — nsandersen, Aug 25 '22 at 10:48
@nsandersen: The problem here is that *your* repository has, in effect, discarded the old (and lousy?) commits with the new-and-improved replacements—but that's just in *your repository*. All *other* Git repositories think that the old commits are "IMPORTANT VALUABLE DATA THAT MUST NEVER BE DISCARDED!!!" (they're not, obviously, since you threw them out yourself). You now have to convince *every other Git repository in the universe that has the old commits* to **also** discard the old-and-lousy commits in favor of the new-and-improved ones. But they don't know that. — torek, Aug 25 '22 at 19:38
Your Git is too stupid to know that when they say "but I have these here commits that we should all definitely keep", you *already decided* to throw those out. Your Git recommends to you that you add those valuable new-to-you (not really new, but Git is dumb) commits to your collective. Git is built to *add new commits*, not to *throw out old ones*. Rebase deliberately throws out old ones (though they remain in the database), and hence doesn't fit in the model. — torek, Aug 25 '22 at 19:39
This whole craziness—the part where you have to convince *every other Git repository that has the old commits* to make the same tradeoff that you've made yourself—is *why* people are reluctant to rebase commits they've shared. Think of it from the next guy's point of view: It's particularly galling to have someone give you a dozen commits, then tell you "hey replace those 12 commits with these 12 improved ones" *after* you've built your own five commits *atop* the original 12. Now *you* have to rebase *your* five commits on the new 12. — torek, Aug 25 '22 at 19:44
Cool - thank you. I have to push/publicise to run CI pipelines. Which in turn suggests to me that a "manual rebase" (transfer to new branch) or squash/merge is less hassle in these cases. — nsandersen, Aug 27 '22 at 08:18
@nsandersen: I'm somewhat convinced that centralized CI build systems that work with Git are ... *broken* is the wrong word; *mis-designed* is closer. Using GitHub-style forks alleviates some problems, but either Git needs a functional "evolve" extension (like Mercurial's) or something else has to be done here. — torek, Aug 27 '22 at 22:29
