The short answer is a simple (albeit not very satisfying) no. In fact, git rerere won't help either, for two reasons:
1. It is just for in-file conflicts, not for "high level" or "tree level" conflicts.
2. A reintroduction of the file won't cause a conflict at all.
That said, this "reintroduction of a file" claim is not quite right. What you're getting is the high level (or tree level) conflict mentioned in point 1 above. To understand this, we need to look at how rebase works as a series of git cherry-pick operations, and thus at how one git cherry-pick operation works.
(For what you can do, jump to the end of this answer.)
Capsule summary of rebase
I'm going to skip most of the rebase detail, and just note that git rebase:
1. enumerates some set of commits to copy;
2. does a detached-HEAD git checkout (or git switch --detach, in Git 2.23 or later) of the place at which the copied commits should land;
3. copies each commit, one at a time, as if by, or sometimes literally by, invoking git cherry-pick; and
4. once all commits are copied, moves the branch name around as if by git branch -f, then re-attaches HEAD to that branch name.
The result is that if we start with, e.g.:
          I--J--K   <-- ourbranch (HEAD)
         /
...--G--H--L   <-- updated-upstream
and run git rebase updated-upstream, we get:
          I--J--K   [abandoned]
         /
...--G--H--L   <-- updated-upstream
            \
             I'-J'-K'   <-- ourbranch (HEAD)
where I', J', and K' are the copies of our commits I-J-K made by the three cherry-pick operations. The original commits still exist (and can be recovered for a while, in case the rebase went badly). It's just that they're harder to find now, because the name ourbranch now locates commit K', the new and improved(?) copy, instead of the original commit K.
The keys to rebasing are to make sure that the set of commits enumerated in step 1 is correct, that the position in step 2 is correct, and that each copy in step 3 is correct. The branch name fiddling in step 4 is the most visible, but least important, thing in the process. This is because Git is all about commits; branch names just serve to find the commits. (The finding step is important of course! If we can't find a commit, what good is it? But there are other ways to find commits, and once we do find them, it's the commits that matter.)
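The four steps can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: commits are plain table entries with a single parent, cherry_pick is a hypothetical stand-in that just records a primed copy instead of doing a real merge, and none of the names below come from Git itself.

```python
# Toy model of the four rebase steps. The commit names, the `parents` table
# and the helper functions are hypothetical; real Git stores far more.

parents = {"G": None, "H": "G", "L": "H", "I": "H", "J": "I", "K": "J"}
branches = {"ourbranch": "K", "updated-upstream": "L"}


def ancestors(commit):
    """Return the commit and everything reachable from it, walking backwards."""
    seen = []
    while commit is not None:
        seen.append(commit)
        commit = parents[commit]
    return seen


def cherry_pick(commit, onto):
    """Stand-in for a real cherry-pick: record a primed copy on top of `onto`."""
    copy = commit + "'"
    parents[copy] = onto
    return copy


def rebase(branch, upstream):
    # Step 1: enumerate the commits on `branch` not reachable from `upstream`.
    skip = set(ancestors(branches[upstream]))
    todo = [c for c in ancestors(branches[branch]) if c not in skip]
    todo.reverse()  # copy the oldest commit first
    # Step 2: "detach HEAD" at the place where the copies should land.
    head = branches[upstream]
    # Step 3: copy each commit, one at a time.
    for commit in todo:
        head = cherry_pick(commit, head)
    # Step 4: move the branch name to the last copy.
    branches[branch] = head
    return todo


copied = rebase("ourbranch", "updated-upstream")
print(copied, "->", ancestors(branches["ourbranch"]))
```

Note that the originals I, J, and K stay in the parents table after the run; only the branch name has moved, which mirrors the "abandoned but recoverable" state in the diagram.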
A cherry-pick is a merge
Because a cherry-pick operation is a merge—though one with a twist—we should look at a normal merge first. Again, this is just a capsule summary that hides a ton of important detail, but we start with a series of commits like this:
          I--J   <-- branch1 (HEAD)
         /
...--G--H
         \
          K--L   <-- branch2
This diagram means that we are "on branch branch1", as git status would say. The current commit is commit J: that's the source for the set of files we have checked out in our working tree. The current branch is branch1. The name branch1 selects commit J (whatever its actual big ugly hash ID is).
Commit J, like every commit, has a snapshot and some set of parents. Like most commits, it has just one parent, in this case commit I. Commit I has one parent, commit H; commit H has one parent, commit G; and so on. Meanwhile the other branch name in the diagram, branch2, selects commit L. Commit L has a snapshot and a single parent K. Commit K has one parent, H. From this point on backwards (remember, Git works backwards) everything is the same as for branch branch1.
All of this means that commit H (which, like every commit, has a snapshot) is the best shared commit on both branches. Commits before H, like G, are also shared, but they're not as good because they're farther away from the two branch-tip commits J and L. Git will find commit H by starting at J and L and working backwards as usual. Since it's on both branches, and is the best such commit, Git will use commit H as the merge base for a merge operation.
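As a rough sketch of what "best shared commit" means, the following Python walks a hand-written copy of the diagram's graph. The parents table and function names are made up for illustration; real Git's merge-base search is considerably more efficient, and a real history can have more than one merge base.

```python
# Hypothetical commit graph matching the diagram: each commit lists its parents.
parents = {
    "G": [], "H": ["G"],
    "I": ["H"], "J": ["I"],
    "K": ["H"], "L": ["K"],
}


def ancestors(commit):
    """All commits reachable from `commit` by working backwards (inclusive)."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_bases(a, b):
    """Shared commits that are not themselves behind another shared commit."""
    common = ancestors(a) & ancestors(b)
    return {
        c for c in common
        if not any(c != other and c in ancestors(other) for other in common)
    }


print(merge_bases("J", "L"))
```

Here G is also shared, but it is dropped because it is an ancestor of H, leaving H as the only merge base.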
A merge, in Git, consists of doing two git diffs, more or less. To do two git diffs, we need three commits.¹ This lets Git run:
git diff --find-renames <hash-of-H> <hash-of-J>
to compare what's in the snapshot for commit H vs what's in the snapshot for commit J. That comparison tells Git what we changed, in branch1.
Git then repeats this but with commit L:
git diff --find-renames <hash-of-H> <hash-of-L>
to compare what's in H vs what's in L. That comparison tells Git what they changed, in branch2.
If we now combine the two sets of changes, and apply the combined changes to the snapshot from H (not the one from J, not the one from L, but the one from H), this will add together our changes and their changes. The resulting combined changes, applied to snapshot H, get us a snapshot that keeps our changes but also adds their changes. Or, if you prefer, it keeps their changes but also adds our changes. The effect is the same either way, as long as there are no conflicts.
So, if all goes well here, Git keeps our changes and adds their changes, or adds our changes and their changes and applies those to the base, or however you wish to view it. The result is a new snapshot, ready to go into a new commit. Git makes this new commit on its own, and calls it a merge commit. Git remembers that it is a merge commit by making a commit with two parents, instead of the usual one:
          I--J
         /    \
...--G--H      M   <-- branch1 (HEAD)
         \    /
          K--L   <-- branch2
That's a normal, non-conflicted merge. As with any commit, Git writes the new commit's hash ID into the current branch name, so that branch1 now selects new merge commit M. Like all commits, M has a snapshot: the snapshot is the result of applying both sets of changes, after using git diff twice to find the two sets of changes. Unlike most commits, M has two parents instead of the usual one. This means that when Git goes to look at the history, by working backwards, it has to work backwards across both "forks" here.²
¹ Think of it this way: git diff always needs two commits. If we use the same two commits both times (if we run git diff J L twice, for instance) we just get the same diff twice. So to get two different diff outputs, we need at least three commits. We could use four, e.g., git diff I J and git diff K L, but that wouldn't actually help us get to our goal. We want to use git diff H J and git diff H L, using H twice, hence we need three commits.
² The word fork here is meant to imply that there's actually something similar going on with GitHub forks. These are not the same thing, but since Git works backwards, if we have a history with a merge in it, Git will see the merge as a fork in the road. (And, as you may have heard, "When you come to a fork in the road, take it.") With a GitHub fork, the original dividing (H forking to I and K) happens as people make new commits. The merging, if any, happens later.
The more curious thing, though, is that since Git works backwards, what we think of as a fork, Git sees as a coming-together. These form merge base points. What we see as a merge, Git sees as a fork!
Handling conflicts
The usual merge conflicts occur when we and they both change the same lines of the same file, but in a different way:
          I--J   <-- branch1 (HEAD)
         /
...--G--H
         \
          K--L   <-- branch2
Suppose that in file F in commit H, there is a typo in the wrong word. We fix the typo, and they replace the word (or vice versa). When Git goes to merge our changes-to-file-F with their changes-to-file-F, Git will declare a merge conflict and leave us to fix up the mess.
We can do this by hand. We open the resulting work-tree file, which has both sets of changes in it, surrounded by conflict markers, and see that they fixed the problem by replacing the wrong word, so we keep their change and throw ours out by deleting our line and the conflict markers, leaving just their line. Or we can use a merge tool, which will generally show us all three files: the F from the merge base commit H, the F from our commit, and the F from their commit. The exact method by which a merge tool does this depends heavily on the tool; we won't worry about this and will just assume that we get the right result.
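The typo-versus-replacement case can be imitated with a very small line-by-line three-way merge. This Python sketch assumes all three versions have the same number of lines (no insertions or deletions), which real Git does not require; real Git also groups neighbouring lines into hunks. The marker labels and sample sentences are made up.

```python
def merge_lines(base, ours, theirs):
    """Three-way merge of equal-length line lists, with conflict markers."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("toy model: all versions must have the same line count")
    merged, conflicted = [], False
    for b, o, t in zip(base, ours, theirs):
        if o == t:        # both sides agree (same change, or untouched)
            merged.append(o)
        elif o == b:      # only they changed this line
            merged.append(t)
        elif t == b:      # only we changed this line
            merged.append(o)
        else:             # both changed it, in different ways
            conflicted = True
            merged += ["<<<<<<< ours", o, "=======", t, ">>>>>>> theirs"]
    return merged, conflicted


base = ["Release notes", "We shipped the relase on Monday."]
ours = ["Release notes", "We shipped the release on Monday."]   # typo fixed
theirs = ["Release notes", "We shipped the build on Monday."]   # word replaced

merged, conflicted = merge_lines(base, ours, theirs)
print("\n".join(merged))
```

Resolving by hand then means editing the marked region down to the one line you want to keep.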
Alternatively, we can use -X ours or -X theirs to pick our changes or their changes and ignore the other side's changes. The drawback here is that we have to know whose change is right: we pick the -X option at the time we run git merge, before we see the conflicts themselves. If, sometimes, our change is better, -X theirs won't work. If their change is sometimes better, -X ours won't work either. Sometimes you might be sure that their change, or your change, is always going to be better; that's where the -X option helps.³ If you cannot be absolutely sure, I recommend avoiding -X and just resolving conflicts yourself.
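One way to picture what the -X option does is as a tie-breaker that only applies where both sides changed the same line differently. The sketch below is a toy under the same equal-line-count assumption as before; the favor parameter is a hypothetical stand-in for -X ours / -X theirs, not Git's implementation.

```python
def merge_lines(base, ours, theirs, favor=None):
    """Toy three-way line merge; `favor` breaks ties like -X ours / -X theirs."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("toy model: all versions must have the same line count")
    merged = []
    for b, o, t in zip(base, ours, theirs):
        if o == t or t == b:      # no disagreement, or only we changed it
            merged.append(o)
        elif o == b:              # only they changed it: always take theirs
            merged.append(t)
        elif favor == "ours":     # true conflict, resolved in our favor
            merged.append(o)
        elif favor == "theirs":   # true conflict, resolved in their favor
            merged.append(t)
        else:
            raise RuntimeError(f"conflict on line {b!r}")
    return merged


base = ["color = red", "size = 1"]
ours = ["color = blue", "size = 1"]
theirs = ["color = green", "size = 2"]

print(merge_lines(base, ours, theirs, favor="ours"))
print(merge_lines(base, ours, theirs, favor="theirs"))
```

Even with favor="ours", their non-conflicting change to the second line is still taken; the option only decides the line both sides touched.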
³ Remember that -X tends to be "backwards" during a rebase, though: -X theirs means our original commit and -X ours means ... something complicated. See What is the precise meaning of "ours" and "theirs" in git?
High-level conflicts
In talking about conflicts above, we were looking at specific lines of one particular file. But those are not the only kinds of conflicts we can get. Suppose that, in H, there is no file named F. Instead, we write our own file F from scratch. It has stuff meant for one situation. They, meanwhile, write their own file F from scratch, and it has stuff meant for some other situation entirely. It's not correct to pick just our F, and it's also obviously not correct to pick their F. Perhaps, for instance, we should rename our F to something else, so that we can just store both files. Or perhaps it makes sense to combine their file with our file.
The key as far as Git is concerned, though, is that there was no file F in commit H at all. Git calls this an add/add conflict, and it means Git cannot resolve it on its own, not even with -X. I like to call these high level conflicts because they get generated in a part of Git that happens before doing a low-level file merge. (The low-level file merge, or at least the kickoff for it, is in ll-merge.c, where ll stands for Low Level.) Others like to call this a tree conflict as the parts of Git involved in finding it are looking at the file tree structures inside Git commits.
There are other conflicts that hit this same high (or tree) level code. One occurs if you delete a file and they modify it, or vice versa: that is, there's some file F in H, and it's in one but not both of J and L, and its contents in whichever of those two commits still has it have changed. That means either we deleted F entirely, or they deleted F entirely, while whoever didn't delete F fixed or changed something in it. This is a modify/delete conflict, and as with the add/add conflict, Git will always stop with a conflict here.
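The two tree-level cases described here can be told apart just by checking which of the three snapshots contain a path. The following Python sketch classifies them; it is an illustration only, with snapshots modelled as path-to-content dictionaries, and it ignores renames, directory/file clashes, and mode changes, all of which real Git also has to consider. The labels are my own wording.

```python
def tree_conflicts(base, ours, theirs):
    """Classify add/add and modify/delete conflicts across three snapshots."""
    found = {}
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if b is None and o is not None and t is not None and o != t:
            # Not in the merge base; both sides created it differently.
            found[path] = "add/add"
        elif b is not None and o is None and t is not None and t != b:
            # We removed it; they changed it.
            found[path] = "modify/delete (deleted by us)"
        elif b is not None and t is None and o is not None and o != b:
            # They removed it; we changed it.
            found[path] = "modify/delete (deleted by them)"
    return found


base = {"F": "v1", "keep": "same"}
ours = {"keep": "same", "G": "our new file"}                 # deleted F, added G
theirs = {"F": "v2", "keep": "same", "G": "their new file"}  # edited F, added G

print(tree_conflicts(base, ours, theirs))
```

A file deleted on one side and left untouched on the other is not reported, since that case merges cleanly as a plain deletion.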
The cherry-pick merge
When we (humans) run git cherry-pick, we generally want to copy a commit. That is, suppose we have this series of commits on some branch:
...--o--o--P--C--o--...--tip <-- branch1
There is some child commit C with parent P. If we have Git run:
git diff <hash-of-P> <hash-of-C>
(or just git show <hash-of-C>, which includes this diff) we'll see what changed between the snapshot in P and the snapshot in C.
Meanwhile, we're on some other commit, perhaps on some other branch entirely:
...--H--I--J <-- branch2 (HEAD)
We have discovered that the difference between P and C is just what we need to add, after our commit J, to make a new commit C' or K or whatever we choose to call it. This will be a copy of C, as it were. We would, in other words, like to have Git run the git diff P C to figure out what changed, then find the same code in our commit J and make that same change.
If all goes well, we'll get our commit C'. Git will even copy the commit message from commit C, so that git show on our new commit will have the same log message. The difference between snapshot J and snapshot C' will be the same as the difference between P and C, except maybe for line numbers. So we'll call this new commit C' to show just how close it is to C.
But: how should Git know which lines match up? Maybe, in the P-vs-C diff, the change is really close to the top of the file, but in our version of the file in J, there is a bunch of new stuff at the top of the file. Or, maybe in P-vs-C, the change is way down in the middle of the file, but in J we don't have all that extra stuff, and it's up close to the top of the file.
What we need Git to do, then, is run git diff P J. That will tell Git what's different from P to J, so that it can line up the P-vs-C changes.
But if we are going to have Git diff P vs J, and P vs C ... that sounds a whole lot like git merge, doesn't it? Suppose we have Git do these two diffs, and then treat commit P as a merge base commit. To the snapshot in P, Git will add all "our" changes in J, so that we keep our changes. To the snapshot in P, Git will add all "their" changes in C, so that we gain their changes. That will give us the right combination, so that we keep what we had, but add their P-vs-C changes.
So this is what git cherry-pick does. It treats their parent commit P as the merge base, J (our HEAD) as our commit, and C (their child commit) as their commit. All the -X options work as before, with -X ours meaning P vs J and -X theirs meaning P vs C.
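The assignment of roles (P as merge base, J as ours, C as theirs) can be written down as a tiny Python model. As with any toy, the assumptions are heavy: snapshots are path-to-content dictionaries, files are merged as whole units, and the commit names, contents, and helper functions are all invented for this sketch.

```python
# Hypothetical snapshots: P is C's parent, J is our current HEAD.
snapshots = {
    "P": {"app.py": "v1", "util.py": "u1"},
    "C": {"app.py": "v1", "util.py": "u2"},              # C changed util.py
    "J": {"app.py": "v1 + our work", "util.py": "u1"},   # we changed app.py
}
parent = {"C": "P"}


def three_way(base, ours, theirs):
    """Whole-file three-way merge; raises on any conflict."""
    result = {}
    for path in set(base) | set(ours) | set(theirs):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t or t == b:      # they did nothing new here: keep ours
            pick = o
        elif o == b:              # we did nothing here: take theirs
            pick = t
        else:                     # both sides differ from the base and each other
            raise RuntimeError(f"conflict in {path}")
        if pick is not None:      # None means "file absent"
            result[path] = pick
    return result


def cherry_pick(commit, head):
    """Merge with the picked commit's parent as base, HEAD as ours, commit as theirs."""
    base = snapshots[parent[commit]]
    return three_way(base, snapshots[head], snapshots[commit])


print(cherry_pick("C", "J"))
```

If our side had deleted util.py instead of leaving it alone, the same call would hit the final branch and raise, which is the modify/delete situation in miniature.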
Now, in your commits, you have deleted some files entirely. They haven't. So P-vs-J will say delete this swath of files. In their commit C, they may have changed one of their files. This is a modify/delete conflict, with your "side" of the merge having deleted the file, and their side modifying the file. Since this is a high-level / tree-level conflict, Git will stop, regardless of any -X options. You will have to resolve this conflict, by confirming that you want the file deleted.
What you can do to make your life easier
You could write your own script, to be run in the case that git cherry-pick (or git rebase's cherry-pick) produces a merge conflict that includes modify/delete conflicts. You can find these conflicts by inspecting Git's index. See the git ls-files --stage output (note that it can be very long) and look for cases where there is a file in stage 1 (the merge base, i.e., their P commit) and stage 3 (their C) but absent in stage 2 (HEAD, i.e., your commit J or equivalent). Resolve the conflict by deleting the two entries in stages 1 and 3. You can do this programmatically using a list of files you know you deliberately deleted, for instance. After that, git status will tell you if there are any other conflicts remaining.
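As a starting point for such a script, the stage check can be written as a small Python function that parses git ls-files --stage output, whose lines have the form "mode object-id stage", a tab, then the path. The function name is hypothetical, the object IDs in the sample are placeholders rather than real hashes, and the sketch assumes paths without unusual characters (Git quotes those unless you use -z).

```python
def deleted_by_us(ls_files_stage_output):
    """Paths present at stages 1 and 3 but absent from stage 2 (our side)."""
    stages = {}
    for line in ls_files_stage_output.splitlines():
        if not line.strip():
            continue
        meta, path = line.split("\t", 1)
        _mode, _object_id, stage = meta.split()
        stages.setdefault(path, set()).add(int(stage))
    return sorted(path for path, seen in stages.items() if seen == {1, 3})


# Placeholder output; the 40-character IDs below are not real object hashes.
sample = (
    "100644 " + "1" * 40 + " 0\tREADME.md\n"
    "100644 " + "2" * 40 + " 1\tdocs/old.txt\n"
    "100644 " + "3" * 40 + " 3\tdocs/old.txt\n"
    "100644 " + "4" * 40 + " 1\tsrc/a.c\n"
    "100644 " + "5" * 40 + " 2\tsrc/a.c\n"
    "100644 " + "6" * 40 + " 3\tsrc/a.c\n"
)

print(deleted_by_us(sample))
```

In the sample, src/a.c is an ordinary content conflict (all three stages present) and is left alone; only docs/old.txt matches the modify/delete pattern where our side did the deleting.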
Annoyingly, there is no way to make Git run this script automatically on conflicts. However, if you have the script detect whether any conflicts remain and whether git status tells you that you're in the middle of a rebase, you can have it run git rebase --continue appropriately, which at least reduces everything to a single shell command.
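A possible shape for that single command, sketched in Python, is below. Several things here are my assumptions rather than anything prescribed above: the function names are made up; conflicts are listed with git diff --name-only --diff-filter=U and resolved with git rm; a rebase in progress is detected by looking for the rebase-merge or rebase-apply directory instead of parsing git status; and GIT_EDITOR is set to true so that continuing the rebase reuses each commit message without opening an editor. Treat it as a draft to adapt and test on a throwaway clone first.

```python
import os
import subprocess


def git(*args):
    """Run a git command in the current repository and return its stdout."""
    env = {**os.environ, "GIT_EDITOR": "true"}  # assumption: keep messages as-is
    done = subprocess.run(["git", *args], check=True,
                          capture_output=True, text=True, env=env)
    return done.stdout


def resolve_deleted(deliberately_deleted, run_git=git):
    """Confirm deletions for known paths; continue the rebase if nothing is left.

    Returns the list of paths that are still unmerged (empty if none).
    """
    unmerged = set(run_git("diff", "--name-only", "--diff-filter=U").splitlines())
    for path in sorted(unmerged & set(deliberately_deleted)):
        # Confirms the deletion: drops the remaining index stages and the file.
        run_git("rm", "--quiet", "--", path)
        unmerged.discard(path)
    if unmerged:
        return sorted(unmerged)  # other conflicts still need a human
    rebase_dirs = [run_git("rev-parse", "--git-path", name).strip()
                   for name in ("rebase-merge", "rebase-apply")]
    if any(os.path.isdir(d) for d in rebase_dirs):
        run_git("rebase", "--continue")
    return []
```

Calling resolve_deleted(["docs/old.txt"]) from inside the repository would then either finish the current step or return the conflicts that still need attention. The run_git parameter exists only so the logic can be exercised with a fake command runner.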