How can a new file be overwritten at a git rebase?

Question

So I had been working on a checked-out feature branch, let's call it feature-branch-development. I ended up with a considerable number of commits on this branch, 15 or so. When it was finally time for me to rebase the master branch into feature-branch-development, it stopped midway, complaining about a new file that I had added on feature-branch-development, and gave me an error saying something like "Your changes on this branch will be overwritten, please commit or stash them." I had of course already committed everything and my working tree was clean. It was as if Git "believed" the new file also existed on master. Also noteworthy: I had no merge conflicts at this point.

I solved the issue by squashing all my commits on feature-branch-development and then rebasing on master again.

My question therefore is: why did Git act like this? I have an intermediate knowledge of Git and I'm merely asking this question to improve my understanding.

I ran into the same problem. After reading your question I tried squashing my commits as you did and it fixed it for me as well, so thanks. — Tony, Jun 21 '18 at 21:12
Oops, well that deleted a file that had been ignored by .gitignore. So beware. I think that ignored file is what caused the conflict in the first place. See my answer. — Tony, Jun 21 '18 at 21:20

score 4 · Accepted Answer · answered Jun 15 '17 at 19:09

This would be easier to answer with a concrete example. It's simple enough in principle though: git rebase means: Copy commits, as if by git cherry-pick. It's the copy step that is failing here.

We start on branch R (for rebase) and pick some new target branch T (for Target). We then identify some set of commits to copy: these are, for the most part, commits that are on R but not on T. (In a bit of a white lie, the git rebase documentation suggests that these are the commits listed by git log T..R or git log T..HEAD. That's mostly true, except that there are three extra fiddles added, or maybe "subtracted" would be the word.)

Then, having listed the commit hashes to copy, Git does a detached-HEAD checkout of the target branch, so that the current (HEAD) commit is the same commit that branch T identifies. Note that we're not on branch T, we're just on the same commit. Now, for each commit whose hash ID we saved, Git effectively (and sometimes literally) runs git cherry-pick <hash-id>:

                    ....... HEAD
                    v
...--o--o--o--o--o--o   <-- T
            \
             A--B--C   <-- R

We run git cherry-pick A to copy commit A. If we call the new copy A', the result looks like this:

                      A'  <-- HEAD
                     /
...--o--o--o--o--o--o   <-- T
            \
             A--B--C   <-- R

Then we run git cherry-pick B to copy B to B':

                      A'-B'  <-- HEAD
                     /
...--o--o--o--o--o--o   <-- T
            \
             A--B--C   <-- R

We repeat until we have all the commits copied, and then as the last step, Git "peels the branch label" R off the original chain and pastes it at the end of the copied chain:

                      A'-B'-C'  <-- R (HEAD)
                     /
...--o--o--o--o--o--o   <-- T
            \
             A--B--C   <-- (only in reflogs and ORIG_HEAD)
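The copy-then-move-the-label behavior can be checked in a throwaway repository. This is only a sketch: the branch names T and R follow the diagrams above, while the file names, the commit() helper, and the identity settings are made up for the illustration.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/T        # call the initial branch T, whatever the default is
git config user.name "Example"            # placeholder identity for this scratch repo only
git config user.email "example@example.invalid"

commit() {  # hypothetical helper: write <text> to <path> and commit with <message>
  mkdir -p "$(dirname "$1")"
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt base "base"               # the shared starting commit
git branch R
commit x.txt x "X"                        # T moves ahead of the fork
git checkout -q R
commit a.txt a "A"
commit b.txt b "B"
commit c.txt c "C"

to_copy=$(git rev-list --count T..R)      # commits that are on R but not on T
old_tip=$(git rev-parse R)

git rebase -q T                           # copy A, B, C onto T, then move the label R

new_tip=$(git rev-parse R)
orig_head=$(git rev-parse ORIG_HEAD)      # the original chain is still reachable from here
echo "copied $to_copy commits; R moved from $old_tip to $new_tip"
```

The new tip has a different hash from the old one because A', B', and C' are new commits, and the old tip remains reachable through ORIG_HEAD and the reflog.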

Conflicts and other problems

Each copy—each cherry-pick step—is done using Git's merge machinery, or what I call the "verb form" of to merge (as opposed to the noun or adjective form, a merge or a merge commit—git merge normally makes a merge commit after doing the verb form to merge, and we are just using the first half of this). You get, e.g., an add/add conflict when you merge a commit that adds a new file with a change that also adds the same new file.

The most likely place to get such a conflict is when making copy A' from A, because the merge base of a cherry-pick operation is the parent commit of the commit being copied. A merge operation—the to merge verb—compares this merge base with two commits. In this case, the two commits are commit A itself, and the tip-most commit of branch T, i.e., the commit to which T points. Let's call this X, and mark the merge base with *:

                      ? [merge in progress] HEAD
                     /
...--o--o--*--o--o--X   <-- T
            \
             A--B--C   <-- R

If both commits A and X, compared to *, add a file path/to/file.txt, you get an add/add conflict. The cherry-pick stops, leaving a merge conflict in progress. You must solve it and tell rebase to resume.
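A minimal way to provoke this add/add conflict, assuming a scratch repository in which both X (on T) and A (on R) add path/to/file.txt with different contents. The commit() helper and the file contents are invented for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/T        # call the initial branch T, whatever the default is
git config user.name "Example"            # placeholder identity for this scratch repo only
git config user.email "example@example.invalid"

commit() {  # hypothetical helper: write <text> to <path> and commit with <message>
  mkdir -p "$(dirname "$1")"
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt base "base"                       # the merge base (*)
git branch R
commit path/to/file.txt "added on T" "X"          # X adds the file on T
git checkout -q R
commit path/to/file.txt "added on R" "A"          # A adds the same path on R

status=0
git rebase T >/dev/null 2>&1 || status=$?         # the copy of A stops with a conflict
conflicts=$(git status --porcelain)               # "AA" marks a both-added path
echo "rebase exit status: $status"
echo "$conflicts"
```

From this state you would edit the file, git add it, and run git rebase --continue, or give up with git rebase --abort.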

What if A adds path/to/file.txt and it's not in commit X? Normally, that would be no problem: Git would just create the file and put it in the new commit. There would be no add/add conflict at all, just a perfectly good copy A' (and we would go on and do the rest of the rebase).

But perhaps file path/to/file.txt exists in the work-tree for some reason (e.g., as an untracked file, and perhaps ignored as well). In this case, Git can't just copy path/to/file.txt out of commit A, overwriting the work-tree version. You get an error message here, just as for a regular merge conflict. You must solve the problem and tell rebase to resume.
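One way this situation can arise (an assumption for illustration, not necessarily what happened in the question): commit A adds the file, a later commit B stops tracking it with git rm --cached, and the file stays behind in the work-tree as an untracked file. Copying A then has to create a path that is already occupied. The commit() helper and the file contents are invented.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/T        # call the initial branch T, whatever the default is
git config user.name "Example"            # placeholder identity for this scratch repo only
git config user.email "example@example.invalid"

commit() {  # hypothetical helper: write <text> to <path> and commit with <message>
  mkdir -p "$(dirname "$1")"
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt base "base"
git branch R
commit x.txt x "X"                                # T moves ahead; X does not contain the file
git checkout -q R
commit path/to/file.txt "tracked version" "A"     # A adds the file
git rm -q --cached path/to/file.txt               # B: stop tracking it, keep it on disk
git commit -q -m "B: stop tracking the file"
echo "local untracked edits" > path/to/file.txt   # now an untracked work-tree file

status=0
output=$(git rebase T 2>&1) || status=$?          # copying A would overwrite the untracked file
echo "rebase exit status: $status"
echo "$output"
```

The exact wording of the error depends on the Git version, so the sketch only looks at the exit status and at the fact that the untracked file is left untouched. Note that this file is untracked but not ignored; the sketch does not cover ignored files, which Git may treat differently.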

There are some additional cases here for files that are in the index/staging-area (and hence can be in various commits) but are marked --assume-unchanged or --skip-worktree. In newer versions of Git, the precise error message identifies whether the file that "would be overwritten" is truly untracked, or marked like this. (This might be true in older versions of Git as well but I do not remember off hand and have not checked.) This is why a concrete example is better: there are several different causes for this particular problem.

For completeness: the subtractions I mentioned

These are also covered (somewhat) in the rebase documentation, but it's worth mentioning the commits git rebase deliberately doesn't copy:

It cannot copy any merge. When using --preserve-merges (replaced by --rebase-merges in newer versions of Git), git rebase will re-perform merges to try to reconstruct them, but it cannot copy the original merges. So normally it does not even try.
It omits commits that have an upstream equivalent (e.g., those that have already been cherry-picked). In the original diagram (with T and R), Git checks to see if the git patch-id value for A, B, or C matches the git patch-id value for any of the commits on T to the right of the merge base commit. If so, Git drops the A, B, and/or C commits whose patch ID already occurs upstream.
If used with --fork-point, the rebase code runs git merge-base --fork-point to attempt to find upstream commits that were deliberately dropped. This usually works pretty well with remote-tracking branches, and usually works less well if you set your own local branches as the upstreams for your own branches. Using --fork-point is the default when rebasing on the upstream using the automatic mechanisms, so one must be careful here. See Git rebase - commit select in fork-point mode for more.
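The patch-ID case is the easiest of the three to observe. The sketch below assumes a scratch repository where commit A from R has already been cherry-picked onto T; the commit() helper and file names are made up for the illustration.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/T        # call the initial branch T, whatever the default is
git config user.name "Example"            # placeholder identity for this scratch repo only
git config user.email "example@example.invalid"

commit() {  # hypothetical helper: write <text> to <path> and commit with <message>
  mkdir -p "$(dirname "$1")"
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit base.txt base "base"
git branch R
commit x.txt x "X"
git checkout -q R
commit a.txt a "A"
commit b.txt b "B"
a_commit=$(git rev-parse R~1)                     # the original A

git checkout -q T
git cherry-pick "$a_commit" >/dev/null            # T gets its own copy of A
git checkout -q R

# Same change, different commit hashes, same patch ID.
id_on_r=$(git show "$a_commit" | git patch-id | cut -d' ' -f1)
id_on_t=$(git show T | git patch-id | cut -d' ' -f1)

before=$(git rev-list --count T..R)               # A and B are on R but not on T
git rebase T >/dev/null 2>&1                      # A is dropped; only B is copied
after=$(git rev-list --count T..R)
echo "patch IDs: $id_on_r / $id_on_t; commits ahead of T: $before before, $after after"
```

Here git log T..R would list two commits before the rebase, yet only one commit is actually copied, which is the "white lie" mentioned earlier.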

score 0 · Answer 2 · answered Jun 21 '18 at 21:24

Short answer: check for files you are ignoring in the new directory structure, either by .gitignore or, as @torek mentioned, through --assume-unchanged.

I ran into a very similar problem. In my case, I had a file that was being ignored somewhere in the new directory structure. Apparently .gitignore'd files live in the working tree, and that is what caused the conflict. I squashed my commits in a new branch, and that successfully rebased onto master without conflicts. Then I noticed a file was missing, and realized it was an ignored file. That file is now completely gone. Something about doing the squash got rid of it without warning.
