
The man page git-rebase(1) says:

-m
--merge
Use merging strategies to rebase. [...]

But of course one can also run into "merge conflicts" without using the --merge option. So in that case, too, there must be some "merge strategy" that handles these conflicts.

What difference does the --merge option make to a rebase?

It seems to be something rather fundamental: For a rebase --merge, Git stores its working files in a folder named $GIT_DIR/rebase-merge (as it does for interactive rebases). If the --merge option is not used (and the rebase is non-interactive) that folder is named $GIT_DIR/rebase-apply.
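As a rough way to observe this, the sketch below checks which of the two state directories exists while a rebase is stopped (for example on a conflict). The function name `rebase_state` is made up for illustration; note that `rebase-apply` is also used by a plain `git am` session, so "apply" here does not prove that a rebase is what is running.

```shell
# Hypothetical helper: print which rebase state directory currently exists.
# "git rev-parse --git-path" resolves the path inside $GIT_DIR.
rebase_state() {
    if [ -d "$(git rev-parse --git-path rebase-merge)" ]; then
        echo merge
    elif [ -d "$(git rev-parse --git-path rebase-apply)" ]; then
        echo apply      # may also be a plain "git am" in progress
    else
        echo none
    fi
}
```

Running `rebase_state` inside a repository while a rebase is paused should print the name of the directory family in use; outside of any rebase it prints `none`.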

Jürgen
  • Interesting question. The manual suggests that you would need it when a file was renamed in the upstream, but I just tested it and a plain rebase dealt with that situation automatically and applied the commit to the renamed file. So I'm interested in the answer too, if someone knows. – joanis Apr 29 '19 at 14:59
  • With some more experimentation, my guess is now that it enables specifying the merge strategy (via `-s` and/or `-X`, which both imply `-m`). I say this because although `-m` technically changes the algorithm used, in the several cases I tested the final result was identical. More confusing is that I was able to create a scenario where `git merge upstream` and `git rebase upstream` gave different results, but in that case `git rebase -m upstream` gave the same results as `git rebase upstream`, although the log messages looked different along the way. – joanis Apr 29 '19 at 15:15
  • For the record, I've tested with a file being renamed in the upstream, and that did not confuse any rebase or merge. I've also tried having a change done in upstream and the same change done and then undone in my branch, which is the case where merge differs from rebase. The two rebases behaved identically: the change is undone in the final result, whereas merge keeps the change in the final result. – joanis Apr 29 '19 at 15:20

2 Answers

In one sentence, what -m or --merge does for git rebase is to make sure that rebase uses git cherry-pick internally.

The -m flag to force cherry-pick is often, but not always, redundant. In particular, any interactive rebase always uses cherry-pick anyway. As joanis noted in a comment, specifying any -s or -X option also forces the use of cherry-pick. So does -k, as noted below.

Long (or at least longer)

Rebase has a long history in Git: the first rebase operations were done by formatting each commit-to-be-rebased into a patch, then applying the patch to some other commit. That is, originally, git rebase was mostly just:

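branch=$(git symbolic-ref --short HEAD)
target=$(git rev-parse "${onto:-$upstream}")
# --stdout emits the patches themselves as one mbox stream, not a list of file names
git format-patch --stdout "$upstream"..HEAD > "$temp_file"
git checkout "$target"
git am -3 "$temp_file"
git checkout -B "$branch" HEAD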

(except for argument handling, all the error checking, and the fact that the git am can stop with an error, requiring hand-fixing and git rebase --continue; also, the above scripting is my reduced-for-readability version and probably does not resemble the original script much).

This kind of rebase handles most cases fairly well. The most common case that it doesn't handle well involves rebasing across some file renames. It also cannot copy an "empty" commit—one whose patch is empty, that is—as git format-patch is not allowed to omit the patch part.

These empty commits are normally omitted by git rebase even when using -m; you must add -k to preserve them. To preserve them, git rebase must switch to the cherry-pick variant, if it has not already done so.

To pass -s or -X arguments, rebase must invoke git cherry-pick rather than git am, so any of those flags also requires the cherry-pick variant.

Using git format-patch never does any rename detection. Hence, if the stream of commits you're copying should all have rename detection applied with respect to HEAD, the -m flag is very important. For a concrete example, consider this series of commits:

          B--C--D   <-- topic
         /
...--o--A--E--F--G   <-- mainline

Suppose that the difference from A to B, B to C, and C to D is all handled within a file named lib-foo.ext. But in commit F, this file is renamed to be lib/foo.ext instead. A git format-patch of A..D will show changes to be made to file lib-foo.ext, none of which will apply correctly to commit G as there is no lib-foo.ext file. The rebase as a whole will fail.

A git cherry-pick of commit B when HEAD identifies commit G, however, will find the rename and apply the A-vs-B changes to the version of lib/foo.ext in commit G:

          B--C--D   <-- topic
         /
...--o--A--E--F--G   <-- mainline
                  \
                   B'   <-- HEAD [detached]

The next cherry-pick, of C while HEAD identifies B', will discover that the B-to-C change to lib-foo.ext should be applied to the renamed lib/foo.ext, and the last cherry-pick of D will do the same, so that the rebase will succeed.
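The scenario can be reproduced in a throwaway repository. The following is only a sketch under my own assumptions: the file contents, the commit messages and the temporary directory are invented, and commit F is a pure rename (no content change), which is the easiest case for rename detection.

```shell
set -e
# Throwaway repository; every name and file content here is made up.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/mainline   # name the initial branch
git config user.name "Example"
git config user.email "example@example.com"

# Commit A: a ten-line file named lib-foo.ext
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > lib-foo.ext
git add lib-foo.ext
git commit -q -m A

# Commits B, C, D on topic: each one edits a single line of lib-foo.ext
git checkout -q -b topic
for n in 1 5 10; do
    sed "s/^$n\$/$n changed/" lib-foo.ext > tmp && mv tmp lib-foo.ext
    git commit -q -a -m "edit line $n"
done

# Commits E, F, G on mainline: F renames the file to lib/foo.ext
git checkout -q mainline
echo e > e.txt; git add e.txt; git commit -q -m E
mkdir lib
git mv lib-foo.ext lib/foo.ext
git commit -q -m F
echo g > g.txt; git add g.txt; git commit -q -m G

# Rebase topic onto mainline using the cherry-pick (merge) machinery
git checkout -q topic
git rebase -q --merge mainline

# The edits from B, C and D should now be in the renamed file
cat lib/foo.ext
```

To compare the two methods, the same setup can be rerun with `git rebase --apply mainline` (the `--apply` spelling needs Git 2.26 or later). I make no claim here about how that variant behaves for a pure rename; see the side note on `-3` below and the comments.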

The rename detection code is slow, so a rebase that has no renames to do, and no "empty" commits to keep, can run much faster when run via the git format-patch | git am system. That's about the only way in which the original method is better than the cherry-pick variant: it's faster in constrained cases. (However, the speed improvement only occurs when there are lots of rename candidates, but either none of them are actual renames, or none of them matter.)

(Side note: the -3 argument, or --3way to use the longer spelling, tells git am to pass that flag on to each git apply, where the apply will attempt to do a three-way merge if needed, using the blob hashes in the index line in the diff. Under some conditions, it seems like this might suffice to handle renamed files—in particular if the blob hash exactly matches. The cherry-pick method does full rename detection, which handles inexact matches; -3 cannot do that. See also What is the difference between git cherry-pick and git format-patch | git am?, as Jürgen noted.)

torek
  • Thank you, torek, for this very enlightening answer. Until now, I had pictured cherry-picks, too, as applications of previously formatted patches. On that point, I found the thorough answers to this related question very helpful: ["What is the difference between git cherry-pick and git format-patch | git am?"](https://stackoverflow.com/q/52119937/11402257) – Jürgen Apr 29 '19 at 18:06
  • Thanks @torek for this detailed answer, this is very helpful! I still wonder, though: in my tests, I simulated a rename situation that should have failed in the `format-patch | am` pipeline, yet succeeded anyway. Is there some heuristic in `git rebase` that sometimes switches to the cherry-pick variant even for a non-interactive rebase with no switches? My test was with a `dev.upstream` branch having a rename commit, and the current branch having an edit on that file before the renaming, and `git rebase dev.upstream` worked as is, applying the change to the renamed file. – joanis Apr 29 '19 at 19:29
  • @joanis: no, there isn't (or wasn't the last time I looked, which might have been 2.15ish), and I would expect that to have failed in at least some cases. Note though that `git am` uses `git apply -3` to do three-way matching on blobs, so a pure rename (as opposed to a rename-with-mods) might (maybe) be found. I'd have to experiment a bit to check all the nitty-gritty details here. – torek Apr 29 '19 at 20:21

Since Git 2.26 (Q1 2020), "git rebase" has learned to use the merge backend (i.e. the machinery that drives "rebase -i") by default, while allowing the "--apply" option to use the "apply" backend (e.g. the moral equivalent of "format-patch piped to am").
The rebase.backend configuration variable can be set to customize.
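As a small illustration (assuming Git 2.26 or later), the backend can be chosen per invocation with `--apply` or `--merge`, or per repository through `rebase.backend`. The temporary repository and the placeholder branch name `upstream` below are my own assumptions.

```shell
# Per invocation ("upstream" is a placeholder branch name):
#   git rebase --apply upstream    # format-patch | am style
#   git rebase --merge upstream    # cherry-pick style, the default since 2.26

# Per repository: demonstrated in a throwaway repo
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" config rebase.backend apply   # accepted values: apply, merge
git -C "$repo" config --get rebase.backend
```

With this setting, a plain `git rebase` in that repository would use the apply backend unless an option that needs the merge backend is given.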

This helps illustrate the difference between --merge (now the default) and the old --apply.

See commit 10cdb9f, commit 2ac0d62, commit 8295ed6, commit 76340c8, commit 980b482, commit c2417d3, commit 6d04ce7, commit 52eb738, commit 8af14f0, commit be50c93, commit befb89c, commit 9a70f3d, commit 93122c9, commit 55d2b6d, commit 8a997ed, commit 7db00f0, commit e98c426, commit d48e5e2 (15 Feb 2020), and commit a9ae8fd, commit 22a69fd (16 Jan 2020) by Elijah Newren (newren).
(Merged by Junio C Hamano -- gitster -- in commit 8c22bd9, 02 Mar 2020)

rebase: change the default backend from "am" to "merge"

Signed-off-by: Elijah Newren

The am-backend drops information and thus limits what we can do:

  • lack of full tree information from the original commits means we cannot do directory rename detection and warn users that they might want to move some of their new files that they placed in old directories to prevent their becoming orphaned.
  • reduction in context from only having a few lines beyond those changed means that when context lines are non-unique we can apply patches incorrectly.
  • lack of access to original commits means that conflict marker annotation has less information available.
  • the am backend has safety problems with an ill-timed interrupt.

Also, the merge/interactive backend have far more abilities, appear to currently have a slight performance advantage and have room for more optimizations than the am backend (and work is underway to take advantage of some of those possibilities).

git rebase now includes in its man page:

Interruptability

The am backend has safety problems with an ill-timed interrupt; if the user presses Ctrl-C at the wrong time to try to abort the rebase, the rebase can enter a state where it cannot be aborted with a subsequent git rebase --abort.
The interactive backend does not appear to suffer from the same shortcoming. (See this thread for details.)


Since then, Git 2.39 (Q4 2022) fixes some bugs in the reflog messages when rebasing and changes the reflog messages of "rebase --apply" to match "rebase --merge" with the aim of making the reflog easier to parse.

Again, it illustrates another difference between the two options, which is now resolved.

See commit 9a1925b, commit 6159e7a, commit be0d29d, commit 33f2b61, commit 1f2d5dc, commit da1d633, commit 4e5e1b4, commit 57a1498 (12 Oct 2022) by Phillip Wood (phillipwood).
See commit a524c62 (17 Oct 2022) by Junio C Hamano (gitster).
(Merged by Taylor Blau -- ttaylorr -- in commit 8851c4b, 30 Oct 2022)

rebase --apply: make reflog messages match rebase --merge

Signed-off-by: Phillip Wood

The apply backend creates slightly different reflog messages to the merge backend when starting or finishing a rebase and when picking commits.
These differences make it harder than it needs to be to parse the reflog (I have a script that reads the finishing messages from rebase and it is a pain to have to accommodate two different message formats).
While it is possible to determine the backend used for a rebase from the reflog messages, the differences are not designed for that purpose.
c2417d3 ("rebase: drop '-i' from the reflog for interactive-based rebases", 2020-02-15, Git v2.26.0-rc0 -- merge listed in batch #8) removed the clear distinction between the reflog messages of the two backends without complaint.

As the merge backend is the default it is likely to be the format most common in existing reflogs.
For that reason the apply backend is changed to format its reflog messages to match the merge backend as closely as possible.
Note that there is still a difference as when committing a conflict resolution the apply backend will use "(pick)" rather than "(continue)" because it is not currently possible to change the message for a single commit.

In addition to c2417d3 we also changed the reflog messages in 68aa495 ("rebase: implement --merge via the interactive machinery", 2018-12-11, Git v2.21.0-rc0 -- merge) and 2ac0d62 ("rebase: change the default backend from "am" to "merge"", 2020-02-15, Git v2.26.0-rc0 -- merge listed in batch #8).
This commit makes the same change to "git rebase --apply" that 2ac0d62 made to "git rebase" without any backend specific options.
As the messages are changed to use an existing format any scripts that can parse the reflog messages of the default rebase backend should be unaffected by this change.

There are existing tests for the messages from both backends which are adjusted to ensure that they do not get out of sync in the future.
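To look at these messages in your own repository, something like the following should work. It is only a sketch: the function name `rebase_reflog_subjects` is made up, and the exact wording of the messages depends on the Git version and backend, so none is shown here.

```shell
# Hypothetical helper: print the reflog messages that a rebase wrote
# for the given ref (HEAD by default), newest first.
rebase_reflog_subjects() {
    # %gs is the reflog subject; "git log -g" walks the reflog
    git log -g --format=%gs "${1:-HEAD}" | grep '^rebase'
}
```

Running `rebase_reflog_subjects` after a `git rebase --apply` and again after a `git rebase --merge` is one way to check how closely the two formats match on a given Git version.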

VonC