6

I often use git rebase -i to clean up my history before publishing it. Usually I want to edit commits back to wherever the current branch forked off, without changing its fork point. I do it something like this: git rebase -i $(git show-branch --merge-base $PARENT_BRANCH HEAD)

It's an ugly command and I'm trying to find a better way. As long as I'm at it, I'd like to have git automatically figure out the right parent.

I think what I want is an alias for git rebase -i --fork-point $(something), where something finds the branch with the most recent common ancestor of the current branch. It doesn't need to be bulletproof. If it works for a linear topic branch, that's good enough for my purposes.

Andrew

3 Answers

9

With Git 2.24 (Q4 2019), no more git rebase -i --onto @{upstream}...HEAD

The new "git rebase --keep-base <upstream>" tries to find the original base of the topic being rebased and rebase on top of that same base, which is useful when running the "git rebase -i" (and its limited variant "git rebase -x").

The command also has learned to fast-forward in more cases where it can instead of replaying to recreate identical commits.

See commit 414d924, commit 4effc5b, commit c0efb4c, commit 2b318aa (27 Aug 2019), and commit 793ac7e, commit 359eceb (25 Aug 2019) by Denton Liu (Denton-L).
Helped-by: Eric Sunshine (sunshineco), Junio C Hamano (gitster), Ævar Arnfjörð Bjarmason (avar), and Johannes Schindelin (dscho).
See commit 6330209, commit c9efc21 (27 Aug 2019), and commit 4336d36 (25 Aug 2019) by Ævar Arnfjörð Bjarmason (avar).
Helped-by: Eric Sunshine (sunshineco), Junio C Hamano (gitster), Ævar Arnfjörð Bjarmason (avar), and Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 640f9cd, 30 Sep 2019)

rebase: teach rebase --keep-base

A common scenario is if a user is working on a topic branch and they wish to make some changes to intermediate commits or autosquash, they would run something such as

git rebase -i --onto master... master

in order to preserve the merge base.
This is useful when contributing a patch series to the Git mailing list, one often starts on top of the current 'master'.
While developing the patches, 'master' is also developed further and it is sometimes not the best idea to keep rebasing on top of 'master', but to keep the base commit as-is.

In addition to this, a user wishing to test individual commits in a topic branch without changing anything may run

git rebase -x ./test.sh master... master

Since rebasing onto the merge base of the branch and the upstream is such a common case, introduce the --keep-base option as a shortcut.

This allows us to rewrite the above as

git rebase -i --keep-base master

and:

git rebase -x ./test.sh --keep-base master

respectively.
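A minimal sketch of that shortcut in action, assuming Git 2.24 or later and a throwaway repository. The branch name topic, the file names, and the commit helper are made up for illustration; GIT_SEQUENCE_EDITOR=true simply accepts the todo list unchanged so the example runs without an editor.

```shell
set -e
repo=$(mktemp -d)                          # hypothetical scratch repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # pin the initial branch name
git config user.name "Example User"
git config user.email "user@example.com"

# Helper (illustrative only): create one file and commit it.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git checkout -q -b topic
commit topic1
commit topic2
git checkout -q master
commit master-moves-on                     # upstream advances meanwhile
git checkout -q topic

base_before=$(git merge-base master topic)
# Replay topic's commits on top of the same base they already have.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --keep-base master
base_after=$(git merge-base master topic)
topic_commits=$(git rev-list --count master..topic)

echo "merge base before: $base_before"
echo "merge base after:  $base_after"
echo "commits on topic:  $topic_commits"
```

The two merge-base lines should print the same hash: the history was open for editing, but the branch still forks from the same commit.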

git rebase man page now includes:

--keep-base:

Set the starting point at which to create the new commits to the merge base of <upstream> <branch>.
Running 'git rebase --keep-base <upstream> <branch>' is equivalent to running 'git rebase --onto <upstream>... <upstream>'.

This option is useful in the case where one is developing a feature on top of an upstream branch.
While the feature is being worked on, the upstream branch may advance and it may not be the best idea to keep rebasing on top of the upstream but to keep the base commit as-is.

Although both this option and --fork-point find the merge base between <upstream> and <branch>, this option uses the merge base as the starting point on which new commits will be created, whereas --fork-point uses the merge base to determine the set of commits which will be rebased.


mvds suggests in the comments combining this with git rebase --reapply-cherry-picks

In my workflow, to develop a minimal binary patch for firmware written in C, I need to go back to a certain commit hash, make a branch, perform modifications and find differences in the resulting binary.

In the time since that certain commit, our compiler was upgraded and became smarter, turning the codebase into non-compiling code.

To remedy this, I cherry-pick any commits needed to be able to compile again.
Those cherry-picked commits will mess up the magic done by --keep-base, unless --reapply-cherry-picks is used.

The debate is on the Git mailing-list


Before Git 2.39 (Q4 2022), "git rebase --keep-base" used to discard the commits that are already cherry-picked to the upstream, even when "keep-base" meant that the base, on top of which the history is being rebuilt, does not yet include these cherry-picked commits.
The --keep-base option now implies --reapply-cherry-picks and --no-fork-point options.

See commit aa1df81, commit ce5238a, commit d42c9ff, commit a770602, commit f21becd, commit b8dbfd0, commit 05ec418, commit 96601a2 (17 Oct 2022) by Phillip Wood (phillipwood).
(Merged by Taylor Blau -- ttaylorr -- in commit 003f815, 30 Oct 2022)

rebase --keep-base: imply --no-fork-point

Signed-off-by: Phillip Wood

Given the name of the option it is confusing if --keep-base actually changes the base of the branch without --fork-point being explicitly given on the command line.

The combination of --keep-base with an explicit --fork-point is still supported even though --fork-point means we do not keep the same base if the upstream branch has been rewound.
We do this in case anyone is relying on this behavior which is [tested in t3431](https://lore.kernel.org/git/20200715032014.GA10818@generichostname/).

git rebase now includes in its man page:

Running git rebase --keep-base <upstream> <branch> is equivalent to running git rebase --reapply-cherry-picks --no-fork-point --onto <upstream>...<branch> <upstream> <branch>.

git rebase now includes in its man page:

If <upstream> or --keep-base is given on the command line, then the default is --no-fork-point, otherwise the default is --fork-point. See also rebase.forkpoint in git config.

And still with Git 2.39:

rebase --keep-base: imply --reapply-cherry-picks

Reported-by: Philippe Blain
Signed-off-by: Phillip Wood

As --keep-base does not rebase the branch it is confusing if it removes commits that have been cherry-picked to the upstream branch.
As --reapply-cherry-picks is not supported by the "apply" backend this commit ensures that cherry-picks are reapplied by forcing the upstream commit to match the onto commit unless --no-reapply-cherry-picks is given.

git rebase now includes in its man page:

rebasing on top of the upstream but to keep the base commit as-is. As the base commit is unchanged this option implies --reapply-cherry-picks to avoid losing commits.

git rebase now includes in its man page:

In the absence of --keep-base (or if --no-reapply-cherry-picks is given), these commits will be automatically dropped.

Because this necessitates reading all upstream commits, this can be expensive in repositories with a large number of upstream commits that need to be read.
When using the 'merge' backend, warnings will be issued for each dropped commit (unless --quiet is given). Advice will also be issued unless advice.skippedCherryPicks is set to false (see git config).

VonC
  • I don't have 2.24 yet (I stick to what's in APT, and it's not in the repos), but this feature is exactly what I was looking for so I'm accepting it anyway. – Andrew Jan 21 '20 at 20:34
  • I would like to add that --reapply-cherry-picks may be of use: In my workflow, to develop a minimal binary patch for firmware written in C, I need to go back to a certain commit hash, make a branch, perform modifications and find differences in the resulting binary. In the time since that certain commit, our compiler was upgraded and became smarter, turning the codebase into non-compiling code. To remedy this, I cherry-pick any commits needed to be able to compile again. Those cherry-picked commits will mess up the magic done by --keep-base, unless --reapply-cherry-picks is used! – mvds Sep 21 '20 at 19:03
  • @mvds Excellent point, thank you. I have included your comment in the answer for more visibility. – VonC Sep 21 '20 at 20:03
  • Apparently it is a point of discussion whether --keep-base should imply --reapply-cherry-picks: https://public-inbox.org/git/0EA8C067-5805-40A7-857A-55C2633B8570@gmail.com/ – mvds Sep 22 '20 at 13:02
4

First, --fork-point is meant for remote-tracking names. The way it works is that it uses the reflog for the supplied upstream. For more about this, see, e.g., Git rebase - commit select in fork-point mode.

Second, and maybe more important, you can run git rebase upstream, git rebase --onto newbase upstream, or even just git rebase. When using the two-argument form, you gain a lot of freedom:

  • The upstream argument limits which commits are going to be copied. We'll have more to say about this in a moment.

  • The newbase argument chooses which commit is the point after which the copies are to be added. The actual hash ID here is very important.

That is, when you do use --onto, you must pick an exact commit for the --onto newbase argument. This means you can be very loose about what you use as the upstream argument. But when you don't use --onto, upstream gets used for both purposes, so the upstream parameter is the one that requires precision. One purpose is loose and free, and the other isn't.

What this means is that you can use two arguments to regain any extra looseness you'd like (but may not need), or no arguments to gain convenience.

(The next part is constructed oddly due to an update to the question.)

git show-branch --merge-base X Y = git merge-base X Y

The git show-branch command now takes --merge-base or --independent when given multiple arguments. With exactly two arguments, it does the same thing as git merge-base X Y, i.e., it finds the merge base(s) of revisions X and Y. (git merge-base also now takes --independent, rather than just assuming the octopus strategy, but this only applies when using three or more commit specifiers.)

I prefer git merge-base here, as I think it's more obvious (and of course a little shorter to type in).
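A quick check of that equivalence, sketched in a scratch repository (the branch and file names and the commit helper are invented for the example):

```shell
set -e
repo=$(mktemp -d)                          # hypothetical scratch repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Helper (illustrative only): create one file and commit it.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git checkout -q -b feature-X
commit A
git checkout -q master
commit M

# Two ways of asking for the merge base of the same two revisions.
via_show_branch=$(git show-branch --merge-base master feature-X)
via_merge_base=$(git merge-base master feature-X)

echo "show-branch --merge-base: $via_show_branch"
echo "merge-base:               $via_merge_base"
```

Both lines should print the hash of the base commit.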

With no arguments, the upstream comes from your branch settings

Every branch can have one (but only one) upstream setting.

The upstream setting of any branch B is usually origin/B, because we tend to push our branches to GitHub or Bitbucket or some corporate web server, whose URL we stick under the name origin so that we don't have to type it out all the time. If you've used up your one upstream on origin/B and want a second setting that git rebase will use automatically, you're sort of out of luck (but read on). But if you haven't used up the one upstream, just set it to whatever you'd like git rebase to use automatically. For instance, if you're on feature-X now and want it to rebase on develop:

$ git branch --set-upstream-to=develop

and now feature-X has develop as its one upstream. Running git rebase (with or without -i) will rebase as if you ran the same command with develop as its upstream argument.
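To see the effect end to end, here is a small sketch in a scratch repository. The file names and the commit helper are assumptions for the example; the branch names follow the ones used above.

```shell
set -e
repo=$(mktemp -d)                          # hypothetical scratch repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Helper (illustrative only): create one file and commit it.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git branch develop
git checkout -q -b feature-X
git branch -q --set-upstream-to=develop    # feature-X's one upstream
upstream=$(git rev-parse --abbrev-ref '@{upstream}')
echo "upstream of feature-X: $upstream"

commit A
git checkout -q develop
commit D                                   # develop advances
git checkout -q feature-X

# No arguments: the configured upstream is both the limiter and the target.
git rebase -q
new_base=$(git merge-base develop feature-X)
echo "feature-X now sits on: $new_base"
```

After the argument-less rebase, the merge base should be the tip of develop.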

If you have used up the one upstream, you can make your own alias:

alias.name = !git rebase "$@" $(git config --get branch.$(git symbolic-ref --short HEAD).base) #

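(pick some name). This lets you set, using git config, an extra per-branch value: for instance, branch.feature-X.base set to develop. The $(git symbolic-ref --short HEAD) extracts the current branch name, the git config --get gets the setting, and the rebase then uses that as its one upstream argument. The trailing # comments out the copy of the arguments that Git appends to a shell alias, since "$@" has already placed them.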
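A sketch of wiring this up, where the alias name base is just one possible choice and the repository contents are invented for the example:

```shell
set -e
repo=$(mktemp -d)                          # hypothetical scratch repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Helper (illustrative only): create one file and commit it.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git branch develop
git checkout -q -b feature-X
commit A
git checkout -q develop
commit D
git checkout -q feature-X

# The alias from above, under the hypothetical name "base".
git config alias.base '!git rebase "$@" $(git config --get branch.$(git symbolic-ref --short HEAD).base) #'
# The extra per-branch setting the alias reads.
git config branch.feature-X.base develop
configured=$(git config --get branch.feature-X.base)

git base -q                                # expands to: git rebase -q develop
new_base=$(git merge-base develop feature-X)
echo "base setting: $configured"
echo "feature-X now sits on: $new_base"
```

Because the one argument is used both as limiter and as target, feature-X ends up on top of the tip of develop.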

The drawback to a single limiter-and-target / newbase argument

The drawback here is that, given a graph of the form:

             o--o   <-- develop
            /
...--o--o--o
            \
             A--B--C   <-- feature-X (HEAD)

you wind up with copies that come after the tip of develop:

                  A'-B'-C'  <-- feature-X (HEAD)
                 /
             o--o   <-- develop
            /
...--o--o--o
            \
             A--B--C   [abandoned]

when what you wanted was to keep the copies in the same place and just fuss with their commit text, or maybe squash two together, or something:

             o--o   <-- develop
            /
...--o--o--o--A'-B'-C'  <-- feature-X (HEAD)
            \
             A--B--C   [abandoned]

Using --onto

With the two-argument form, the --onto parameter picks the target commit:

             o--o   <-- develop or whatever
            /
...--o--o--*    [pick this commit as target]
            \
             A--B--C   <-- feature-X (HEAD)

The copies will now go after *. The set of commits to be copied is determined by using, in effect, upstream..feature-X: that is, the commits reachable by starting at feature-X and working backwards, but excluding commits reachable by starting at upstream and working backwards.

Now you need only find commit *. If you have two names, such as feature-X and develop, you can use the gitrevisions three-dot syntax, develop...feature-X or feature-X...develop (the syntax is symmetric when there is only one merge base like this), to specify commit *. This only works in fairly new versions of Git: in older ones, use git show-branch or git merge-base (with two commit hash IDs they both behave the same way).
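Put together, the limiter and the three-dot target look roughly like this in a scratch repository. The file names and the commit helper are assumptions; GIT_SEQUENCE_EDITOR=true accepts the todo list as-is so nothing opens an editor.

```shell
set -e
repo=$(mktemp -d)                          # hypothetical scratch repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Helper (illustrative only): create one file and commit it.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git branch develop
git checkout -q -b feature-X
commit A
commit B
commit C
git checkout -q develop
commit D
git checkout -q feature-X

# Commits that will be copied: reachable from feature-X but not from develop.
to_copy=$(git rev-list --count develop..feature-X)
# Commit "*": the single merge base of the two names.
star=$(git merge-base develop feature-X)

# Target = merge base (three dots), limiter = develop.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --onto develop...feature-X develop
star_after=$(git merge-base develop feature-X)

echo "commits to copy: $to_copy"
echo "base before: $star"
echo "base after:  $star_after"
```

The base printed before and after should be the same commit.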

Having specified commit * as the --onto target, you can again allow the branch's upstream to work as the limiter. That is, you can omit the explicit upstream because it defaults to the actual upstream. And, since you can use the @{upstream} or @{u} syntax, you can make a very short and simple alias, as you did in your own answer.

If you want to keep a separate upstream (origin/feature-X) and base, you can go back to the idea of configuring an extra value per branch name. In this case, you'll need to use it twice, so instead of an alias, you might want a full blown script, where you can do error checking:

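#! /bin/sh
# git-base - rebase on the current branch's base setting
. git-sh-setup
# Current branch name; symbolic-ref fails (and we exit) on a detached HEAD.
branch=$(git symbolic-ref --short HEAD) || exit
# The extra per-branch setting, e.g. branch.feature-X.base = develop.
base=$(git config --get "branch.$branch.base") || die "branch.$branch.base is not set"
# The commit to keep as the base: the merge base of that setting and HEAD.
mbase=$(git merge-base "$base" HEAD) || die "there is no merge base"
# Copy base..HEAD onto the merge base; "$@" passes options such as -i through.
git rebase --onto "$mbase" "$@" "$base"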

Name this git-base, make it executable, put it in your PATH, and you can now run git base.

torek
  • "it may be the case that you aren't actually using a parent branch name here, or are running git show-branch --merge-base HEAD $PARENT_BRANCH." It probably varies, honestly, depending on whether I remember the HEAD part on any given day. Fixed that line in the question so it actually makes sense. – Andrew Nov 10 '18 at 02:47
  • I'm not sure which answer to mark as accepted. Mine concisely answers exactly the question I asked, but yours is much more informative in general. Is there established etiquette on this? – Andrew Nov 14 '18 at 18:46
  • Might as well take your own, I guess. I'm not sure what if anything is standard here. – torek Nov 14 '18 at 22:34
  • Okay, done. But I added a link to yours for a more detailed explanation of the mechanics behind the solution. – Andrew Nov 15 '18 at 00:51
2

After plugging away at this for an hour or so, I came up with something good enough. This command does what I want if the branch has an upstream set, and if said upstream is the one I want to compare against:

$ git rebase -i --onto @{upstream}...HEAD

Those triple dots do not do the same thing as in git-log and similar commands. From the git-rebase documentation:

As a special case, you may use "A...B" as a shortcut for the merge base of A and B if there is exactly one merge base.

So this is saying "find the merge-base of HEAD and its own upstream, and rebase against that". Normally local branches have no upstream, but it can be set, and there is a gitconfig option (autoSetupMerge) that will do it automatically for new branches. Hence:

$ git config --global branch.autoSetupMerge always
$ git config --global alias.fixup 'rebase -i --onto @{upstream}...HEAD'
$ git branch -u parentbranch childbranch  # Repeat for other branches as needed.

After this I can edit history back to the branch point easily with:

git fixup

And it works for all future branches.
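For anyone who wants to try this without touching a real repository, here is a self-contained sketch. It sets the alias locally rather than globally, and the file names and the commit helper are made up; GIT_SEQUENCE_EDITOR=true just accepts the todo list unchanged.

```shell
set -e
repo=$(mktemp -d)                          # hypothetical scratch repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/parentbranch
git config user.name "Example User"
git config user.email "user@example.com"

# Helper (illustrative only): create one file and commit it.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit base
git checkout -q -b childbranch
git branch -q -u parentbranch              # childbranch tracks parentbranch
commit child1
commit child2
git checkout -q parentbranch
commit parent-moves-on
git checkout -q childbranch

# Same alias as above, but stored in this repository's config only.
git config alias.fixup 'rebase -i --onto @{upstream}...HEAD'

fork_before=$(git merge-base parentbranch childbranch)
GIT_SEQUENCE_EDITOR=true git fixup -q
fork_after=$(git merge-base parentbranch childbranch)

echo "fork point before: $fork_before"
echo "fork point after:  $fork_after"
```

The fork point should be identical before and after, even though parentbranch has moved on.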

(note: See torek's answer for a detailed explanation of what's going on under the hood here)

Andrew
  • Using `--onto` is probably the way to go. I've been working on a long-form answer. That three-dot syntax is the correct syntax for specifying the (well, *a*) merge base; see [gitrevisions](https://www.kernel.org/pub/software/scm/git/docs/gitrevisions.html). – torek Nov 10 '18 at 02:08
  • Can you quote where gitrevisions says that? All the references I can find to triple-dots use it to mean roughly "the set of commits going back to the merge-base" not "the merge-base commit itself". I was surprised to find that it worked that way with --onto. – Andrew Nov 10 '18 at 02:20
  • It's a little sneaky, because A...B really is a range syntax for anything that uses ranges. The trick is that Git commands that want a single number pass that to `git rev-parse` (or the C code equivalent), which spits out the two end points as positive references and the merge base(s) as negative references. `git diff` and, in modern Git, `git rebase`, know how to handle that. – torek Nov 10 '18 at 02:26
  • In this particular case, if you examine the old shell-script rebase setup code, it [explicitly checks for the three-dot syntax and runs `git merge-base`](https://github.com/git/git/blob/8858448bb49332d353febc078ce4a3abcc962efe/git-legacy-rebase.sh#L580). The new code [is very similar](https://github.com/git/git/blob/8858448bb49332d353febc078ce4a3abcc962efe/builtin/rebase.c#L1275). – torek Nov 10 '18 at 02:38