
This question uses Git 2.7.0.windows.1, so some of the git commands might be outdated.

If my git merge command doesn't detect a renamed file, how can I tell git to manually merge the changes in two files that are supposed to be a single file, without starting the merge over and using a lower rename threshold?

Reproduction Steps:

git init
echo "//Hello world!" > hw.h
git add . && git commit -m "Initial commit"
git checkout -b someBranch
mv hw.h hw.hpp
echo "//Foobar" > hw.hpp
git add . && git commit -m "Change hw to HPP & change content"
git checkout master
echo "//Boofar" > hw.h
git add . && git commit -m "Change content of hw"
git merge -X rename-threshold=100% someBranch

You will get merge conflicts, but no conflicted chunks. That is, the only conflict/error you should get is:

CONFLICT (modify/delete): hw.h deleted in someBranch and modified in HEAD. Version HEAD of hw.h left in tree.
Automatic merge failed; fix conflicts and then commit the result.

And git status --porcelain will show:

UD hw.h
A  hw.hpp

Normally, when merging, it's ideal to set the threshold for detecting renames low enough that renames are detected. Some people recommend 5%, for example. In my case, I'm doing a massive merge (>1000 files and about 9m LOC), and I had to raise the rename threshold high enough to avoid any "false positives"; literally one percent lower and I got a huge swath of falsely-detected renames (I know, duplicated code sucks). At the value I wound up using, I get only a small handful of missed renames, which seems like a better option.

TL;DR lowering the rename threshold is not an option for me; how can I, without starting the merge over, tell git to consider hw.h and hw.hpp to be a single file (with conflicts), rather than two files as shown above?

Nick Giampietro
  • By the way, I feel your pain. I tried to do that kind of merge once before. We ended up using a different porting strategy. :-) – torek Mar 30 '17 at 19:07

1 Answer

The tools for this are a bit clunky, but they are there.

You need to be sure that the merge itself stops before committing. In your case this happens automatically. For trickier merges, where Git thinks it's doing it correctly but is not, you would add --no-commit, but this then affects the next few steps. We'll ignore that problem for now.

Next, you need to get all three versions of the file in question. Since Git stopped with a conflict, we're in good shape: the three versions are all accessible through the index. Remember that the three versions we care about are merge base, --ours, and --theirs.

If Git had detected the rename correctly, all three versions would be in the index under a single name. Since it did not, they are not: we need two names. (With the "Git thinks it did the merge correctly" case, the merge base version of the file is not in the index at all, and we have to retrieve it some other way.) The two names in your case here are hw.h and hw.hpp, so now we do this:

$ git show :1:hw.h > hw.h.base    # extract base version
$ git show :2:hw.h > hw.h         # extract ours
$ mv hw.hpp hw.h.theirs           # move theirs into place

(The renaming is not strictly necessary; it's just to help keep it all straight and nicely illustrated.)
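To see which index slots those git show commands read from, it can help to list the stages first. The following is only a sketch: it rebuilds the question's repository in a throwaway directory (the temporary directory and the placeholder user.name / user.email are my assumptions, not part of the question) and then prints the index.

```shell
# Rebuild the question's repository in a scratch directory.
cd "$(mktemp -d)" && git init -q .
git symbolic-ref HEAD refs/heads/master      # use the question's branch name
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"
echo '//Hello world!' > hw.h
git add . && git commit -q -m "Initial commit"
git checkout -q -b someBranch
mv hw.h hw.hpp && echo '//Foobar' > hw.hpp
git add . && git commit -q -m "Change hw to HPP & change content"
git checkout -q master
echo '//Boofar' > hw.h
git add . && git commit -q -m "Change content of hw"
git merge -X rename-threshold=100% someBranch || true   # stops on the conflict

# Each line is: mode, blob hash, stage number, path.
# Stage 1 = merge base, 2 = ours, 3 = theirs, 0 = no conflict recorded.
git ls-files --stage
```

In this reproduction the listing should show hw.h at stages 1 and 2 only, and hw.hpp at stage 0, which is why "theirs" has to be taken from the other name.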

Now we want to merge the one file with git merge-file:

$ git merge-file hw.h hw.h.base hw.h.theirs

This uses your configured merge.conflictStyle so that what's in the merged file looks just as you would expect, except that the labels on the conflicted lines are a bit different. I have diff3 set, so I get:

$ cat hw.h
<<<<<<< hw.h
//Boofar
||||||| hw.h.base
//Hello world!
=======
//Foobar
>>>>>>> hw.h.theirs

You can now resolve this as usual, rm the extra .base and .theirs files, git add the final result, git rm --cached hw.hpp, and git commit. (It's up to you when to git rm --cached hw.hpp: it's safe to do this at any point in time before the commit, but once done you can no longer get "theirs" from the index; see below.)
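Put together, the whole sequence might look like the sketch below. It again rebuilds the question's repository in a scratch directory (temporary directory and placeholder identity are assumptions), and the "resolution" is hypothetical: it simply takes the //Foobar line, where in a real merge you would edit the conflicted file by hand.

```shell
# Rebuild the question's repository in a scratch directory.
cd "$(mktemp -d)" && git init -q .
git symbolic-ref HEAD refs/heads/master      # use the question's branch name
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"
echo '//Hello world!' > hw.h
git add . && git commit -q -m "Initial commit"
git checkout -q -b someBranch
mv hw.h hw.hpp && echo '//Foobar' > hw.hpp
git add . && git commit -q -m "Change hw to HPP & change content"
git checkout -q master
echo '//Boofar' > hw.h
git add . && git commit -q -m "Change content of hw"
git merge -X rename-threshold=100% someBranch || true   # stops on the conflict

# Extract the three versions and merge them into hw.h.
git show :1:hw.h > hw.h.base
git show :2:hw.h > hw.h
mv hw.hpp hw.h.theirs
git merge-file hw.h hw.h.base hw.h.theirs || true   # non-zero exit when conflicts remain
cat hw.h                                            # shows the conflict markers

# Hypothetical resolution: keep the "theirs" content.
echo '//Foobar' > hw.h

# Clean up, stage the result, drop the stale hw.hpp entry, and commit.
rm hw.h.base hw.h.theirs
git add hw.h
git rm -q --cached hw.hpp
git commit -q -m "Merge someBranch, resolving hw.h by hand"
```

The result is a normal two-parent merge commit that contains only hw.h; renaming it to hw.hpp afterwards would be a separate step.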

Note that the "ours" and "theirs" versions are also available through git show HEAD:path and git show MERGE_HEAD:path. To get at the base version without the index, we would have to run git merge-base HEAD MERGE_HEAD to find its hash ID (and then assume there's a single merge base as well [1]), and git show <hash>:path. This is what we must do if Git thinks it has done the merge correctly.
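As a sketch of that index-free route, assuming the question's reproduction (rebuilt here in a scratch directory with a placeholder identity) and assuming a single merge base, the three versions can be pulled from the commits directly. The output file names hw.h.base, hw.h.ours, and hw.h.theirs are just illustrative.

```shell
# Rebuild the question's repository in a scratch directory.
cd "$(mktemp -d)" && git init -q .
git symbolic-ref HEAD refs/heads/master      # use the question's branch name
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"
echo '//Hello world!' > hw.h
git add . && git commit -q -m "Initial commit"
git checkout -q -b someBranch
mv hw.h hw.hpp && echo '//Foobar' > hw.hpp
git add . && git commit -q -m "Change hw to HPP & change content"
git checkout -q master
echo '//Boofar' > hw.h
git add . && git commit -q -m "Change content of hw"
git merge -X rename-threshold=100% someBranch || true   # stops on the conflict

# Find the merge base commit (this assumes there is exactly one).
base=$(git merge-base HEAD MERGE_HEAD)

# Read each version straight from its commit; note that "theirs"
# lives under the new name on the other branch.
git show "$base:hw.h"      > hw.h.base
git show HEAD:hw.h         > hw.h.ours
git show MERGE_HEAD:hw.hpp > hw.h.theirs
```

From here the three files can be fed to git merge-file exactly as before.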

Note also that if you really want to—I imagine this would only be true if you wanted to use some other tool(s) you have, that require it—you can use git update-index to shuffle the entries around in the index, moving hw.hpp into slot-3 of hw.h so that it does show up as "theirs", and shows up that way in git status. For this particular example:

 $ printf '100644 bbda177a6ecfe285153467ff8fd332de5ecfb2f8 3\thw.h' |
     git update-index --index-info

The hash here came from git ls-files --stage and is the hash for hw.hpp. (You need a second step to remove the hw.hpp index entry.)
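Both steps together might look like this sketch, again on a scratch rebuild of the question's repository (temporary directory and placeholder identity are assumptions). Instead of pasting the hash it reads it with git rev-parse, and for the second step it uses git rm --cached, which is one possible way to drop the entry rather than necessarily the one intended above.

```shell
# Rebuild the question's repository in a scratch directory.
cd "$(mktemp -d)" && git init -q .
git symbolic-ref HEAD refs/heads/master      # use the question's branch name
git config user.name "Example User"          # placeholder identity (assumption)
git config user.email "user@example.com"
echo '//Hello world!' > hw.h
git add . && git commit -q -m "Initial commit"
git checkout -q -b someBranch
mv hw.h hw.hpp && echo '//Foobar' > hw.hpp
git add . && git commit -q -m "Change hw to HPP & change content"
git checkout -q master
echo '//Boofar' > hw.h
git add . && git commit -q -m "Change content of hw"
git merge -X rename-threshold=100% someBranch || true   # stops on the conflict

# Step 1: register the hw.hpp blob as stage 3 ("theirs") of hw.h.
theirs=$(git rev-parse :0:hw.hpp)
printf '100644 %s 3\thw.h\n' "$theirs" | git update-index --index-info

# Step 2: remove the now-redundant hw.hpp index entry.
# The working-tree copy of hw.hpp is left in place, untracked.
git rm -q --cached hw.hpp

git ls-files --stage
git status --porcelain
```

After this, hw.h has stages 1, 2, and 3 in the index and git status reports it as an ordinary both-modified conflict.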


[1] Use git merge-base --all to find all merge bases. If there is more than one, you can either pick one arbitrarily (this is what -s resolve does), or try to merge all the merge bases into a virtual merge base. To merge two merge bases, you find their own merge base, and merge the two bases as if they were branch tips, using that merge base. Recurse and iterate as needed (this is what Git does with the default -s recursive strategy) until you have a single merge base version of the file.

torek
  • Might it be safer to say `$ cp hw.hpp hw.h.theirs # move theirs into place` so hw.hpp doesn't disappear? Or is it not risky to use `mv` because "their" hw.hpp can be recovered through git? – Nick Giampietro Mar 31 '17 at 15:36
  • One more follow-up question: if I _did_ want to capture the rename to `hw.hpp`, how would the first three commands change? – Nick Giampietro Mar 31 '17 at 15:42
  • @NickGiampietro: Yes, as long as the files are in the index, you can use `git show` or `git checkout-index` to extract them (and use `git ls-files --stage` to enumerate what's in the index, including hash IDs and stage numbers). To get the rename to occur you would `git update-index` the blobs with their stage numbers, or maybe even just use `git mv`, although I have not tried that and am not sure what happens to the various stages (I am only familiar with `git mv` of a resolved, stage-zero-only, file). – torek Mar 31 '17 at 20:10
  • One more follow-up: after doing this, I see in my target file all of the conflicts, as you said. However, when I run `git mergetool` I get an error saying "No files need merging." Is this normal? – Nick Giampietro Mar 31 '17 at 22:11
  • I never use `git mergetool` but I imagine it looks at the three stages, which would be the reason to use `git update-index` to manipulate the stage entries. – torek Mar 31 '17 at 22:19