I can't address anything about your IDE itself,[1] but your comment here is correct:
"So if I get this right: Whatever comes after 'git rebase' will be the destination, that (current) is rebased onto."
It's also important to remember that branches (or more precisely, branch names) matter very little here. The rebase operation will, at the end, move the current branch name, but everything rebase does in between is about commits, not branches.
One needs to remember at all times that Git is really all about commits. Each commit is numbered with a unique but random-looking, incomprehensible hexadecimal string, which humans tend to prefer abbreviated, e.g., 41b65f.[2] Each commit:
- contains a full snapshot of all files (in a compressed, Git-only, de-duplicated format);
- contains some metadata, such as the name and email address of whoever made the commit.
The metadata in each commit include the hash ID of a previous commit, or sometimes multiple previous commits. These are the parents of the commit. This forms a backwards-looking chain, linking commits together, but backwards. So Git actually works backwards.
A branch name simply selects some particular commit, which we've designated as important in some way. From here, Git will work backwards. That makes the selected commit the last commit on the branch. The commits themselves are independent of the branch name: Git needs only the raw hash ID of each commit, to look up the commit. The name stores—and thus also provides—the raw hash ID of this "last commit", so that mere humans don't have to memorize big ugly hash IDs.
Moreover, as we do work, we get "on" some branch, using git checkout or git switch. That attaches Git's special name HEAD to the branch name, so that Git knows which name we're using. It also extracts, from the commit, the read-only (and compressed and de-duplicated, readable only by Git itself) files that are stored in that commit, so that we can see them and work with them or update them as appropriate.
I like to draw this situation horizontally, with the latest commit—selected by the branch name—on the right:
... <-F <-G <-H <-- branch1, branch2 (HEAD)
Here we're using commit H, which is the latest on both branches. We're using H via the name branch2: both names select this commit right now, but HEAD is attached to the name branch2. Commit H contains inside itself, in its metadata, the raw hash ID of its parent commit G, which contains the raw hash ID of still-earlier commit F, and so on.
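The backwards walk can be sketched in a few lines of Python. This is only a toy model, not how Git stores anything: the dictionaries `commits` and `branches` and the function `history` are names made up for this illustration, and the one-letter IDs stand in for real hash IDs.

```python
# Toy model: each commit records only its parent's ID; a branch name
# records only the ID of the last commit on that branch.
commits = {
    "F": {"parent": None},  # real history continues before F; cut off here
    "G": {"parent": "F"},
    "H": {"parent": "G"},
}
branches = {"branch1": "H", "branch2": "H"}
HEAD = "branch2"  # HEAD is attached to a branch name, not to a commit


def history(branch_name):
    """Start at the commit the name selects and follow parent links backwards."""
    commit_id = branches[branch_name]
    seen = []
    while commit_id is not None:
        seen.append(commit_id)
        commit_id = commits[commit_id]["parent"]
    return seen


print(history(HEAD))  # ['H', 'G', 'F']
```

Both names produce the same list here, because both select commit H.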
When we make a new commit, it gets a new, unique, big ugly hash ID. Let's call this I instead of trying to guess what it would be. Git will:
- write out a new snapshot of files (de-duplicating as much as possible, so only truly new versions of files need to be stored anew);
- write out metadata saying that we're the author of this new commit, and so on, and include in this metadata the real hash ID of commit H; and
- last, take the new hash ID I (computed by writing out all of the above[3]) and stuff it into the current branch name: the one to which HEAD is attached.
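The three steps above can be sketched in Python. Treat this as a simplified illustration under stated assumptions: Git hashes an object as SHA-1 over a "commit <size>" header, a NUL byte, and the body, but real commit bodies also carry a committer line and timestamps, which are left out here. The function `commit_id`, the placeholder strings such as "snapshot-H", and the `branches` dictionary are all invented for the example.

```python
import hashlib


def commit_id(tree, parent, author, message):
    """Hash a simplified commit body the way Git hashes objects:
    SHA-1 over b'commit <size>' + NUL + body."""
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")  # the link back to the previous commit
    lines.append(f"author {author}")
    lines.append("")
    lines.append(message)
    body = "\n".join(lines).encode()
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# Both names start out selecting commit H; HEAD is attached to branch2.
h = commit_id("snapshot-H", "id-of-G", "A U Thor", "commit H")
branches = {"branch1": h, "branch2": h}
HEAD = "branch2"

# Make commit I: its parent is whatever the current branch name selects,
# and its new hash ID is then written into that same branch name.
i = commit_id("snapshot-I", branches[HEAD], "A U Thor", "commit I")
branches[HEAD] = i

print(branches["branch1"] == h)  # True: branch1 did not move
print(branches["branch2"] == i)  # True: branch2 now selects I
```

Because the parent's hash ID is part of the hashed text, the same snapshot and message with a different parent would get a different hash ID.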
So after making new commit I, our picture is now:
...--F--G--H <-- branch1
            \
             I <-- branch2 (HEAD)
The special name HEAD is still attached to the name branch2, but the name branch2 now finds commit I. Commits up through H are still on both branches. The branches won't diverge unless and until we switch back to branch1 and do something there, such as make a new commit or two. Drawn with the generic names branch-X and branch-Y, diverged branches look like this:
             I--J <-- branch-X
            /
...--F--G--H
            \
             K--L <-- branch-Y
Commits up through H are on both branches, while branch-X and branch-Y have diverged.
What's actually important here are the commits. The branch names just help us find the commits. Of course that's important too—the commits can't matter if we can't find them! But the real information is (in) the commits themselves.
[1] I do not have or use that one. Indeed, I have a general allergic reaction to most IDEs and prefer command line operation, which I consider analogous to wanting to do woodworking in a shop full of tools, rather than with a Ronco WoodDoctor or whatever. I might, for instance, use this thing in a pinch, but I wouldn't use it for all-day-every-day work.
[2] This one is quite short: the standard shortest that Git produces is 7 characters long, and the shortest that Git will accept as input is 4. Git will find all internal objects that begin with whatever prefix you use, and if there is only one matching object, that's the one that gets selected. If more than one matches the prefix, you get an error about ambiguity, and modern Git shows you all the hash IDs that matched.
[3] This is where the real magic lies, in Git. The hash scheme used here is SHA-1, which is no longer secure but is still good enough for Git. Git is moving to SHA-256, though, and the transition is going to be messy.
Rebasing
This brings us to the idea of rebasing. Given:
             I--J <-- branch-X
            /
...--F--G--H
            \
             K--L <-- branch-Y (HEAD)
we might decide that we'd like things better if we could change commits K and L somehow, so that they come after commit J.
Now, it's literally impossible to change anything about any commit (because of the hash ID tricks that Git uses). But what if we were to copy K and L to new and improved commits (let's call them K' and L') that do in fact come after J?
That is, we'll end up with:
                  K'-L'
                 /
             I--J
            /
...--F--G--H
            \
             K--L
where I've taken all the names away. Let's put the names back, but cleverly switch them around a bit:
                  K'-L' <-- branch-Y (HEAD)
                 /
             I--J <-- branch-X
            /
...--F--G--H
            \
             K--L ???
Since we use the name branch-Y to find the two commits there, we'll find L' and then K' now, in Git's usual back-ass-ward fashion.
What happens to commits K-L? Nothing, that's what: they're still in there and if you somehow remember their raw hash IDs, you can still find them. Git will retain them for a while,[4] and then if nobody can find them for long enough, they'll "fall away" and vanish for real.
The tool that makes these copies-of-commits is git rebase. Rebase itself is a pretty big and powerful tool, and inside it, it has a smaller tool that copies one commit at a time. This smaller tool is git cherry-pick.
Rebase starts by using Git's detached HEAD mode, which we haven't mentioned before. It's pretty simple though. In detached HEAD mode, the special name HEAD isn't attached to a branch name any more. Instead, HEAD points directly to a commit, rather like a branch name:
             I--J <-- HEAD, branch-X
            /
...--F--G--H
            \
             K--L <-- branch-Y
The rebase command starts out by listing out the hash IDs of the commits to copy (in the right order, backwards for Git, forwards for you: K, then L). Then it uses this detached-HEAD trick to make HEAD point directly to where you want the copies to go. Since you want the copies to go after commit J in this drawing, that's where our detached HEAD ends up.
Next, for each commit in the list of commit hash IDs, Git runs one git cherry-pick.[5] Each cherry-pick operation is, technically, a merge operation inside Git, and can have merge conflicts, but if all goes smoothly, Git will do the "pick" step on its own and make one new commit. Since HEAD is detached, this step just writes the new commit's hash ID into the name HEAD directly:
                  K' <-- HEAD
                 /
             I--J <-- branch-X
            /
...--F--G--H
            \
             K--L <-- branch-Y
Commit K has now been copied, via cherry-pick or some other mechanism (see footnote 5), to the new-and-supposedly-improved K'. The difference between K and K' comes in two parts:
- The stored snapshot in K' is presumably different. However, comparing H vs K, to see what changed, will produce the same diff output (more or less, at least) as comparing snapshot J vs K'. The "more or less" part here sometimes does some heavy lifting: the diff might be quite different if there were a lot of merge conflicts you had to resolve.
- And of course, the parent of K' is J, not H.
All of this is handled by the git cherry-pick tool, so that rebase only needs to submit the correct hash IDs to git cherry-pick, and have the correct commit checked out at that time. That's why rebase had to do a detached-HEAD checkout of commit J, though.
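The "same diff, different snapshot" idea can be shown with a deliberately naive Python sketch. It is not Git's algorithm: a real cherry-pick is a three-way merge that can produce conflicts, while this toy simply replays whole-file changes and assumes they never collide. The snapshots are plain dictionaries, and `diff` and `apply` are helper names invented here.

```python
def diff(old, new):
    """Changes that turn snapshot `old` into `new` (None marks a deleted file)."""
    changes = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            changes[path] = new.get(path)
    return changes


def apply(snapshot, changes):
    """Return a new snapshot with `changes` replayed on top of `snapshot`."""
    result = dict(snapshot)
    for path, content in changes.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


# Snapshots as {path: content}; the file names are made up for the example.
H = {"README": "v1", "main.c": "v1"}
K = {"README": "v1", "main.c": "v2"}               # K changed main.c
J = {"README": "v2", "main.c": "v1", "x.c": "v1"}  # branch-X changed other files

K_prime = apply(J, diff(H, K))  # copy K's change onto J

print(K_prime == K)                      # False: the snapshots differ
print(diff(J, K_prime) == diff(H, K))    # True: the change is the same
```

When both sides touch the same file, this toy would silently overwrite one side, which is exactly where real Git stops with a merge conflict instead.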
Now that K has been copied to K', rebase needs to issue another cherry-pick operation. This one will be asked to copy commit L. Git will compare K vs L to see what needs to be imported as changes to the snapshot in K'. You can get merge conflicts here, as this is yet another merge. But if all goes well, Git will make a new snapshot L' on its own. The parent of L' will be K', since K' is the current, or HEAD, commit. The result will look like this:
                  K'-L' <-- HEAD
                 /
             I--J <-- branch-X
            /
...--F--G--H
            \
             K--L <-- branch-Y
That was the last copying step required, so git rebase is almost finished: it now only needs to yank the name branch-Y off commit L and make it point to commit L' instead, and then re-attach your HEAD to branch-Y:
                  K'-L' <-- branch-Y (HEAD)
                 /
             I--J <-- branch-X
            /
...--F--G--H
            \
             K--L ???
The rebase process is now complete.
So, this is where each of the pieces comes from:
- Rebase needs to know which commits to copy. These come from the current branch, by starting at the last commit and working backwards as usual. (That list then has to have its order reversed.)
- Rebase needs to know where to put the copies of commits. That comes from an argument you supply: the name branch-X, in this case, to tell it to put the copies after commit J. You can use a raw commit hash ID here: the name doesn't matter, only the commit hash ID actually matters. But humans are bad at commit hash IDs.
- Rebase also needs to know where to stop listing commits to copy. This also comes from an argument you supply: the name branch-X, in this case. By starting at commit J and working backwards, Git can tell which commits are already "on" branch-X. Those commits won't get copied. So Git won't copy J (though of course that wasn't on the to-copy list either), nor I. But, importantly, Git won't copy H or any earlier commit, even though those commits are on branch-Y. They're on branch-X too, and that's what makes them not get copied.
Again, a raw hash ID would suffice for "what not to copy" as Git is only interested in the commit hash IDs. The only name Git needs is the name branch-Y: that's the name that has to move at the end of the operation. But Git can get that from your current branch: you're on branch-Y when you run git rebase. Rebase always affects the current branch.
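Putting those pieces together, here is a toy Python model of the whole operation. It is a sketch under simplifying assumptions, not Git's implementation: history is single-parent only, a "copy" is just a new ID with a prime mark and a new parent, and conflicts, fork-point handling, and patch-ID checks are ignored. The names `parents`, `branches`, `current_branch`, `reachable`, and `rebase` are invented for the illustration.

```python
# commit ID -> parent ID, matching the drawings above
parents = {"F": None, "G": "F", "H": "G", "I": "H", "J": "I", "K": "H", "L": "K"}
branches = {"branch-X": "J", "branch-Y": "L"}
current_branch = "branch-Y"  # the name HEAD is attached to


def reachable(commit_id):
    """Commits found by starting at commit_id and working backwards."""
    chain = []
    while commit_id is not None:
        chain.append(commit_id)
        commit_id = parents[commit_id]
    return chain


def rebase(upstream):
    # 1. What to copy: on the current branch, but not reachable from upstream.
    stop = set(reachable(branches[upstream]))
    todo = [c for c in reachable(branches[current_branch]) if c not in stop]
    todo.reverse()  # backwards for Git, forwards for us
    # 2. Where the copies go: "detach HEAD" at the upstream commit.
    detached_head = branches[upstream]
    # 3. Copy one commit at a time; each copy's parent is the detached HEAD.
    for original in todo:
        copy = original + "'"
        parents[copy] = detached_head
        detached_head = copy
    # 4. Move the current branch name to the last copy.
    branches[current_branch] = detached_head
    return todo


copied = rebase("branch-X")
print(copied)                           # ['K', 'L']
print(reachable(branches["branch-Y"]))  # ["L'", "K'", 'J', 'I', 'H', 'G', 'F']
print(parents["L"])                     # K  (the originals are still there)
```

Note that `rebase` takes only the upstream argument: the branch to move is always the current one.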
[4] The mechanism that keeps them around for "a while" is that they're semi-secretly findable through reflog entries. The reflog entries eventually expire, and then they become truly un-find-able. A maintenance program (a sort of janitor that Git calls git gc) will discover the dead commits and get rid of them for real, at this point. (In fact, it's git gc that runs the expiry tool, git reflog expire, and the cleanup tools, git repack and friends, but the individual tools are there too, if you need them. Git is a whole machine-shop full of tools. Most of them are even pretty good tools, pretty solid and reliable, although there's the occasional Ronco device like git stash. Yeah, they're flaky and break a lot, but people like them for some reason.)
[5] Rebase is actually a very old tool that was modernized pretty recently. In older versions of Git (before Git 2.26), rebase used something other than cherry-pick by default, and you had to add some options to make it use cherry-pick. Since cherry-pick is usually the right tool, upgrading Git is usually the right thing to do here. The older am-based back end still usually works, and does run faster, so you can still use it; or if you're stuck with an older Git version, you can pass options to git rebase to get it to do cherry-picking.
Fancier rebase
Above, we ran git checkout branch-Y and then git rebase branch-X to do what we wanted, and it worked. The name branch-X was where the copies went. It also made sure we copied just the two "only on branch-Y" commits. So rebase cleverly used a single argument as both where to put the copies and what not to copy.
Eventually you'll find some situation where this is too clever and doesn't work for you. For instance, suppose you have this:
          I <-- feature1
         /
...--G--H <-- main
         \
          J--K <-- feature2
              \
               L <-- feature3
when you discover that feature3 is really part of feature1 after all. You now want to get commit L over onto I, as a copy L', but if you run:
git checkout feature3
git rebase feature1
Git will enumerate all the commits that are reachable from feature3, but not reachable from feature1. The first list goes L, K, J, H, G, ...; the second list goes I, H, G, .... Subtracting the second list from the first leaves L, K, and J, which rebase will then flip to the right order, so rebase will copy J-K-L to J'-K'-L'.
But that's not what you want. You only want L copied to L': you want to leave J and K alone, on feature2.
The rebase command therefore allows you to separate out these two things: you can run git rebase --onto target upstream. In this case, the upstream argument is the one that limits what gets copied, and the target argument is where the copies go. Without --onto, you run git rebase upstream and upstream provides both pieces of information.
So, in this case, you would run:
git checkout feature3 # what we want to copy
git rebase --onto feature1 feature2
The --onto argument feature1 says put the copies after I; the upstream argument feature2 says don't copy commit K or anything earlier; and this copies just the commit(s) you want (if there are more commits than just the one L we show here, they all get copied).
The final result is what you want:
            L' <-- feature3 (HEAD)
           /
          I <-- feature1
         /
...--G--H <-- main
         \
          J--K <-- feature2
              \
               L [abandoned]
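The difference between the two invocations can be checked with a toy Python model of the graph above. As before, this is a hedged sketch rather than Git's real logic: history is single-parent, copies are just primed IDs, and the names `parents`, `branches`, `reachable`, `to_copy`, and `rebase` are invented here. The `onto` parameter plays the role of the --onto argument.

```python
# commit ID -> parent ID, matching the feature1/feature2/feature3 drawing
parents = {"G": None, "H": "G", "I": "H", "J": "H", "K": "J", "L": "K"}
branches = {"main": "H", "feature1": "I", "feature2": "K", "feature3": "L"}


def reachable(commit_id):
    """Commits found by starting at commit_id and working backwards."""
    chain = []
    while commit_id is not None:
        chain.append(commit_id)
        commit_id = parents[commit_id]
    return chain


def to_copy(current, upstream):
    """Commits on `current` that are not reachable from `upstream`, oldest first."""
    stop = set(reachable(branches[upstream]))
    return [c for c in reachable(branches[current]) if c not in stop][::-1]


def rebase(current, upstream, onto=None):
    """Copy the selected commits after `onto` (default: upstream), then move `current`."""
    tip = branches[onto if onto is not None else upstream]
    for original in to_copy(current, upstream):
        copy = original + "'"
        parents[copy] = tip
        tip = copy
    branches[current] = tip


without_onto = to_copy("feature3", "feature1")  # what "git rebase feature1" would pick
with_onto = to_copy("feature3", "feature2")     # what "--onto feature1 feature2" picks
print(without_onto)  # ['J', 'K', 'L']
print(with_onto)     # ['L']

rebase("feature3", upstream="feature2", onto="feature1")
print(reachable(branches["feature3"]))  # ["L'", 'I', 'H', 'G']
print(reachable(branches["feature2"]))  # ['K', 'J', 'H', 'G']
```

Separating the two arguments is what lets J and K stay where they are on feature2 while only L is copied.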
Because Git is a big workshop full of tools, you can also do this by checking out feature1 directly and running git cherry-pick yourself. Note that the result will be slightly different:
          I--L' <-- feature1 (HEAD)
         /
...--G--H <-- main
         \
          J--K <-- feature2
              \
               L <-- feature3
You can now delete the name feature3:
          I--L' <-- feature1 (HEAD)
         /
...--G--H <-- main
         \
          J--K <-- feature2
              \
               L [abandoned]
If you have just the one commit to copy and you want to have the name feature1 identify it, this cherry-pick operation is actually slightly better. But if you have a dozen commits to copy, you probably want to use rebase; it's the power-tool of commit-copying.
There's a great deal about rebase that I have not covered here: in particular, the part where rebase builds up the list of commits to copy is much more complicated than the simple "here minus upstream" thing. There's something called the fork point code, which is clever (it handles certain "upstream rebase" problems pretty nicely) but a bit fragile (it relies on reflogs and can break). There's some magic where Git uses git patch-id to try to figure out whether certain commits were already copied into the upstream as well. This code also works well, but can misfire on rare occasions. But rebase is a pretty good power tool. Just be sure to keep the finger-catchers working, so that you can go to the repository surgeon and have your fingers reattached if you saw them off. :-)