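First: the most common source of this problem is end-of-line formatting changes. To avoid that, use --ignore-cr-at-eol or similar (for a rebase, that means passing it as a merge strategy option, as in git rebase -X ignore-cr-at-eol). Meanwhile, to answer your final question: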
Is git rebase supposed to apply the very first commit?!
Sometimes, yes.
There are seven or so keys to understanding git rebase:
- Commits are essentially snapshots of all files.
- The hash ID of a commit is its true name: Git needs the hash ID to find the commit.
- A branch name just stores one hash ID. Whatever hash ID is in that branch name, that's the last commit on that branch.
- Every commit records its parent(s), specifically their hash IDs.
- No commit, once made, can ever be changed, so Git doesn't try to do that.
- The git cherry-pick command can turn a commit (a snapshot) into changes, by diffing the commit against its parent commit. This assumes that there is exactly one parent; see below for some special cases. Having turned a commit into "make these changes to these files", Git can now use those instructions to make the same changes to the same files in some other commit. Technically this is a full three-way merge, with the picked commit's parent acting as the merge base, except that Git doesn't save the result as a merge commit (with two parents), but rather as a new single-parent commit. The end result is to copy a commit, sort of.
- Since changing commits (including their parentage) is impossible, git rebase doesn't do that: instead, it makes a list of commits that, presumably, you mostly like except for their ancestry. It then checks out some other commit and runs a sequence of git cherry-pick operations, one per commit, from the list it made earlier. Then git rebase takes the branch name that found the commits-to-be-copied and yanks it around so that it now finds the just-copied commits instead.
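The first few keys are easy to check for yourself. Here is a minimal sketch in a throwaway repository; the branch name demo, the file name, and the commit messages are all made up for illustration:

```shell
# Assumption: a scratch repository in a temporary directory; nothing here
# touches any existing repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
# Name the (still unborn) branch explicitly so the default branch name
# setting does not matter.
git symbolic-ref HEAD refs/heads/demo

echo one > file.txt
git add file.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo two >> file.txt
git commit -q -am "second"
second=$(git rev-parse HEAD)

# A branch name just stores one hash ID: that of the last commit.
tip=$(git rev-parse demo)

# Every commit records its parent's hash ID.
parent=$(git rev-parse 'HEAD^')

# The raw commit object: a "tree" line (the snapshot) and a "parent" line.
git cat-file -p HEAD
```

The tip variable should hold the same hash ID as second, and parent the same hash ID as first.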
The end result of all this is just what you'd want, in most cases. You start with, e.g.:
              C--D--E--F   <-- feature (HEAD)
             /
    ...--A--B--G--H   <-- mainline-or-develop
Here, each uppercase letter stands in for one of the big ugly hash IDs. You like the commits on feature, except for one thing: they start out from commit B. You'd like them more if they built on commit H, like this:
              C--D--E--F   [abandoned: old and lousy]
             /
    ...--A--B--G--H   <-- mainline-or-develop
                   \
                    C'-D'-E'-F'   <-- feature (HEAD)
By listing out the hash IDs of the original C-D-E-F commits, then checking out commit H and running four git cherry-pick operations, and finally yanking the name feature around so that it finds commit F' (the new and improved copy of what used to be F), git rebase winds up getting you exactly what you wanted.
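To make those steps concrete, here is a rough by-hand imitation of what git rebase does, as a sketch rather than a description of Git's actual internals. The repository is a scratch one, make_commit is a hypothetical helper, and the feature branch has only two commits (C and D) to keep things short:

```shell
# Assumption: scratch repository; branch names mirror the diagram above.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/mainline

# Hypothetical helper: each commit adds one new file named after itself.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit A; make_commit B
git checkout -q -b feature
make_commit C; make_commit D
git checkout -q mainline
make_commit G; make_commit H

# Step 1: list the commits to copy, oldest first.
todo=$(git rev-list --reverse mainline..feature)
old_tip=$(git rev-parse feature)

# Step 2: check out the target commit (H) as a detached HEAD.
git checkout -q --detach mainline

# Step 3: one cherry-pick per listed commit.
for c in $todo; do
    git cherry-pick "$c" > /dev/null
done

# Step 4: yank the branch name over to the last copied commit.
git checkout -q -B feature

new_tip=$(git rev-parse feature)
new_base=$(git merge-base mainline feature)
git log --oneline --graph feature
```

The copies have new hash IDs (new_tip differs from old_tip), and feature now builds on the tip of mainline.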
But suppose you have this:
    A--B--C--D   <-- br1 (HEAD)

    E--F--G--H   <-- br2
Here, there are no parent/child or ancestor/descendant relationships between commits on the "top row" (found via br1) and those on the "bottom row" (found via br2). If you run:
    git rebase br2
now, Git will list out each of the four commit hash IDs A-B-C-D. That includes the root commit A, which has no parent.
Here's where the special case for cherry-pick comes in. Whenever we have something other than a single parent, cherry-pick needs some help with the idea of turning a commit into a set of changes. For a merge commit, Git needs you to tell Git which parent to treat as "the" parent temporarily. But for a root commit (one with no parent) Git can just assume that there's some kind of ur-commit ε at the front, that has no files at all in it, and diff against the empty tree. All files in commit A, as compared to empty commit ε, are newly added.
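You can see the "everything is newly added" effect directly by diffing Git's empty tree against a root commit. This is only a sketch in a scratch repository with made-up file names; the empty tree's hash ID is computed rather than hard-coded:

```shell
# Assumption: scratch repository with one root commit holding two files.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo hello > a.txt
echo world > b.txt
git add a.txt b.txt
git commit -q -m "root commit A"

# Hash ID of the tree with no files in it at all.
empty_tree=$(git hash-object -t tree /dev/null)

# Compare the empty tree to the root commit's snapshot:
# every file shows up with status A (added).
changes=$(git diff --name-status "$empty_tree" HEAD)
echo "$changes"
```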
This will give you add/add conflicts on every file if commit H, onto which commit A is being cherry-picked, has the same set of files as commit A with different contents (files that match exactly merge cleanly). This is not all that unusual: it can happen when someone uses history rewriting tools, for instance.
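Here is a small reproduction of that situation, assuming a reasonably recent Git in which rebase uses the merge machinery by default. The two branches each consist of a single unrelated root commit, and the file name shared.txt is invented for the example:

```shell
# Assumption: scratch repository; br2 and br1 share no history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/br2

echo "version from br2" > shared.txt
git add shared.txt
git commit -q -m "E"

# Start a second, unrelated history containing the same path.
git checkout -q --orphan br1
echo "version from br1" > shared.txt
git add shared.txt
git commit -q -m "A"

# Copy root commit A onto br2; the rebase stops with a conflict,
# so ignore its non-zero exit status here.
git rebase br2 > /dev/null 2>&1 || true

# An "AA" entry means both sides added the same path.
status=$(git status --porcelain)
echo "$status"

# Put everything back the way it was.
git rebase --abort
```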
Note that there are fancier forms of the git rebase command to handle some of these issues. For instance, suppose we'd like to copy just commits C-D atop H in our br1-and-br2 setup. We can run:
    git rebase --onto br2 br1~2   # or HEAD~2
The --onto argument takes over one of the two jobs of the otherwise-overloaded required argument (which git rebase calls the upstream, which is perhaps not the best choice of word here). With --onto naming the commit on which the copies should land, the upstream argument is free to list the hash ID or other identifier that locates the first (or last, if you read history forwards, instead of backwards) commit not to copy.
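As a sketch of the result, here is the same command run in a scratch repository. To keep it short, br2 has only two commits (E and F), each commit adds its own file so nothing conflicts, and make_commit is a hypothetical helper:

```shell
# Assumption: scratch repository; br1 and br2 share no history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/br1

# Hypothetical helper: each commit adds one new file named after itself.
make_commit() {
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

make_commit A; make_commit B; make_commit C; make_commit D

# Build the unrelated br2 history from an empty index and work tree.
git checkout -q --orphan br2
git rm -q -rf .
make_commit E; make_commit F

git checkout -q br1
old_tip=$(git rev-parse br1)

# Copy only the commits after br1~2 (that is, C and D) onto br2.
git rebase -q --onto br2 'br1~2'

git log --oneline --graph br1
```

Afterwards br1 should contain four commits: the copies of C and D on top of E and F, with A and B left behind.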