Edit (based on the further details edited into the question): it seems possible that rename detection has gone wrong here, and possibly there is a criss-cross merge in the history as well, so that Git is forming a "virtual merge base".
To find out, first run:
git merge-base --all <hash1> <hash2>
on the two hash IDs that the two branch names master and origin/master pointed to at the time the mis-merge happened. We need this to test both whether rename detection is a problem and whether there are multiple merge bases.
If this prints more than one hash ID, the recursive strategy first merges the merge bases, then uses the resulting merge as the new merge base. This can (in rare cases) produce very confusing results. If so, switching to the -s resolve strategy may help: it picks one of the merge bases and sticks with it. (But you have no control over which merge base is used.) Note that setting merge.conflictStyle to diff3 will also sometimes show you the effect of a virtual merge base (but only sometimes).
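As a rough sketch of the check described above, the following shell helpers count the merge bases of two commits. The function names count_merge_bases and describe_merge_bases are made up for illustration; only git merge-base --all comes from the steps above.

```shell
# Print how many merge bases two commits share.
# git merge-base --all prints one hash ID per line.
count_merge_bases() {
    bases=$(git merge-base --all "$1" "$2") || return 1
    printf '%s\n' "$bases" | grep -c .
}

# Say whether a merge of the two commits would involve a virtual merge base.
describe_merge_bases() {
    n=$(count_merge_bases "$1" "$2") || return 1
    if [ "$n" -gt 1 ]; then
        echo "$n merge bases: Git would merge them first; consider -s resolve"
    else
        echo "single merge base: no virtual merge base involved"
    fi
}
```

Running describe_merge_bases with the two hash IDs (or with master and origin/master, if they still point where they did at merge time) tells you which case you are in. If it reports several bases, retrying the merge as git merge -s resolve origin/master is one thing to try.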
Next, whether or not there are multiple merge bases, we can check to see if rename detection is causing problems. If so, there are two things that may help:
- The sledgehammer approach: disable rename detection entirely during the merge (requires Git version 2.8 or higher) by adding -X no-renames, which matches the spelling for git diff (though there it's --no-renames).
- The finer-tuned tack hammer: raise the similarity threshold for detection (the default is 50%): -X rename-threshold=100 or -X find-renames=100 requires an exact match instead of an approximate match. The new spelling, -X find-renames=<n>, matches the spelling of the git diff option and is new in Git version 2.8, but the option itself is much older, having been around since version 1.7.4 under the rename-threshold spelling. Other threshold values are allowed as well; 100% is simply the notable case in which only exact matches count as renames.
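The two options above might be wrapped like this. It is a minimal sketch: the wrapper names merge_no_renames and merge_exact_renames are invented here, and both simply pass any further arguments through to git merge.

```shell
# Sledgehammer: merge with rename detection switched off (Git 2.8 or later).
merge_no_renames() {
    git merge -X no-renames "$@"
}

# Tack hammer: merge, pairing files as renames only when the content
# is a 100% match. -X rename-threshold=100 is the older spelling.
merge_exact_renames() {
    git merge -X find-renames=100 "$@"
}
```

For instance, merge_exact_renames origin/master repeats the merge while ignoring approximate renames. A value between 50 and 100 is a middle ground if exact matching turns out to be too strict.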
To find out if Git is detecting renames, we need the merge base, which is a bit of a problem if there are multiple merge bases: we have to merge the merge bases first, to get a real merge commit that Git will use as the new merge base. I'll just assume that this is not the case, since that process is a bit messy; so we'll go on to look at "the" merge base, using:
git diff --name-status --find-renames=50 --diff-filter=R <basehash> <hash1>
git diff --name-status --find-renames=50 --diff-filter=R <basehash> <hash2>
The <hash1> and <hash2> values are the same as before. We tell Git to give us file names and statuses, and then to print only the files whose status is R (renamed). If Git does think some files are renamed, we will see their old and new names here. How Git combines these during a merge is a bit tricky, but the mere presence of R-status files implies that Git will be doing this sort of thing. If no files are listed, then rename detection is not the culprit after all.
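The two diff commands can be combined into one helper, sketched below. It assumes there is a single merge base (plain git merge-base prints only one even when there are several), and the name renames_since_base is made up for this example.

```shell
# List the files Git considers renamed between the merge base and each tip.
# $1 and $2 are the two commits being merged (<hash1> and <hash2> above).
renames_since_base() {
    base=$(git merge-base "$1" "$2") || return 1
    for tip in "$1" "$2"; do
        echo "== renames from $base to $tip =="
        git diff --name-status --find-renames=50 --diff-filter=R "$base" "$tip"
    done
}
```

Each rename shows up as a line starting with R and a similarity score, followed by the old and new names. If a section is empty, that side has no detected renames at the 50% threshold.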
(See this answer for a detailed description of rename detection in git diff. The merge code uses different command line options, some of which have changed relatively recently. See VonC's answer to Disable Git Rename Detection as well.)
Original answer below.
In general, when merging, Git does not choose either "side". Instead, it takes both sides. Remember that there's a third side to this whole three-way merge thing: there's "your" side (HEAD), "their" side (what you're merging), and the base. This forms a triangle:
            o   <-- HEAD
           /
          o
         /
...--o--B   (base)
         \
          o
           \
            o   <-- theirs
and the merge brings them all together to make a shiny (we hope) diamond:
            o
           / \
          o   \
         /     \
...--o--B       o   result
         \     /
          o   /
           \ /
            o
See also this answer and this more technical / detailed answer.
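To see "takes both sides" on a single file, here is a small sketch using git merge-file, which performs the same kind of three-way merge on three plain files and needs no repository. The function name three_way_demo and the file contents are invented for the demonstration.

```shell
# Three-way merge of one file: "ours" changes the first line,
# "theirs" changes the last line, and the base has neither change.
three_way_demo() {
    dir=$(mktemp -d) || return 1
    printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > "$dir/base"
    printf 'ALPHA\nbeta\ngamma\ndelta\nepsilon\n' > "$dir/ours"
    printf 'alpha\nbeta\ngamma\ndelta\nEPSILON\n' > "$dir/theirs"
    # -p prints the merged result instead of overwriting the first file
    git merge-file -p "$dir/ours" "$dir/base" "$dir/theirs"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Calling three_way_demo prints a file that has both ALPHA and EPSILON: neither side "wins", because each change is measured against the base and both are kept.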
Meanwhile, it turns out that at least some GUIs present this fact to users. Some users get scared by the idea that there are many changes to many files when they changed only one file. They instruct their GUI to undo all the other changes, which means throwing away the other users' work! They then commit this, and you have to revert their merge to get the other users' work back.
(Another source of "touch every file" is when users enable end-of-line conversions in their setups. They take incoming code that uses LF-only or CRLF endings, and convert it to CRLF, or to LF-only, respectively. Then they commit all these changes, which means they have altered every line of every file.)