Git - check if some files were manually unstaged in git merge

Question

Scenario:

In a feature branch merge request against development, we can see that the feature branch reverts some changes from development, although there is no explicit revert or manual changes. This led us to believe that there might have been an incorrect merge somewhere in the feature branch.

We suspect that while merging the development branch into the feature branch at some earlier point, one of the devs accidentally unstaged some of the merge changes.

Is there an easy way to verify if this was the case?

When checking the details of the merge commit, it only shows the files which were added to the commit, but obviously not the ones that were unstaged manually by the developer.

Staging comes before committing. The commit only records what was staged, so you can assume that, if a file is not present, it was not staged. I fear, that is how far you can go. — Daemon Painter, Dec 21 '20 at 08:42
I think you need to analyze the logs of feature and development branch to uncover the issue. — Mohit Natani, Dec 21 '20 at 08:44
It sounds like your history is very far from being linear so uncovering the bad commit would probably be not that easy. Did you inspect the output of `git show MERGE_COMMIT` and `git diff MERGE_COMMIT^1..MERGE_COMMIT^2`? — terrorrussia-keeps-killing, Dec 21 '20 at 10:30

score 0 · Answer 1 · answered Dec 21 '20 at 10:47

You are trying to check the commit content against the theoretical merge content. What you can do is dry-run the merge to see what it would have changed and where it would have conflicted.

https://stackoverflow.com/a/63777004/5237940 provides a first hint to do this. You can adapt it to get the full list of changes that would have been performed during the merge:

git merge-tree <base> <commit or branch 1> <commit or branch 2> |grep -E -- ' their | our |changed |merged|added|removed'

This will give you an output like:

removed in local
  their  100644 5b194c882f6f8f0d0453b5df5a798d3164e432ed README.txt
changed in both
  our    100644 4825357e4088a083f0fee644c51c32a13d6a3df3 bar
  their  100644 0984a2a7a75695e942f0d16a89dd68a988c35c3a bar
changed in both
  our    100644 be89542ce50400c455c8715f0a2d3b12ee8d9a58 foo
  their  100644 ff14dbf9569ae99b880339588b3bcbf84687f5e6 foo

To simply get the file list:

 git merge-tree <base> <commit or branch 1> <commit or branch 2> |grep -E -- ' their | our ' | awk '{print $NF}' |sort -u

You get:

bar
foo
README.txt

Then you can use

git diff --name-only

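to get the files modified by your real commit(s), for example by passing the merge commit's first parent and the merge commit itself as the two revisions to compare. A file that appears in the merge-tree list but not in this one is a candidate for a change that was dropped during the merge.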
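The comparison described above can be sketched end to end as follows. This is only an illustration under stated assumptions: the throwaway repository, the file names a.txt, b.txt and c.txt, and the way the dropped change is simulated are all hypothetical and not taken from the question. It uses the same three-argument form of git merge-tree as the command above, which I believe recent Git versions document as a deprecated mode.

```shell
#!/bin/sh
# Sketch only: builds a throwaway repository so the comparison can be run.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

# Base commit shared by both branches (hypothetical files).
echo "a" > a.txt
echo "b" > b.txt
echo "c" > c.txt
git add a.txt b.txt c.txt
git commit -q -m "base"
git branch -m development

# feature changes c.txt; development changes a.txt and b.txt.
git checkout -q -b feature
echo "c changed on feature" > c.txt
git commit -q -am "feature work"
git checkout -q development
echo "a changed on development" > a.txt
echo "b changed on development" > b.txt
git commit -q -am "development work"

# Merge development into feature, but drop the incoming change to b.txt
# before committing (this stands in for the accidental unstaging).
git checkout -q feature
git merge -q --no-ff --no-commit development
git checkout -q HEAD -- b.txt
git commit -q -m "Merge development into feature"
merge=$(git rev-parse HEAD)

# Files the merge should have changed relative to its first parent.
base=$(git merge-base "$merge^1" "$merge^2")
git merge-tree "$base" "$merge^1" "$merge^2" |
    grep -E -- ' their | our ' | awk '{print $NF}' | sort -u > expected.txt

# Files the merge commit actually changed relative to its first parent.
git diff --name-only "$merge^1" "$merge" | sort -u > actual.txt

# Paths that were expected but are absent from the real merge commit.
missing=$(comm -23 expected.txt actual.txt)
echo "missing from merge commit: $missing"
```

On a real repository only the last three steps are needed, with merge set to the hash of the suspicious merge commit. Note that a path can also legitimately be missing from the second list, for instance when both sides made the same change, so the result is a list of candidates to inspect rather than proof.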

score 0 · Answer 2 · answered Dec 21 '20 at 16:46

You are essentially looking to identify whether some merge is an evil merge.

Currently, the only way to tell for sure whether someone made an evil merge in Git is to re-perform the merge. You do the same merge yourself, and you know whether or not you have done an evil merge. You can then compare your merge result to theirs. If you did a non-evil merge (a "good merge"? an "angelic merge"?) and your result differs from theirs, they must have done an evil merge.

This method requires a lot of human intervention, so it's not really suitable for widespread use. You can, as Sousou noted and linked to, use git merge-tree to determine whether the original merge had merge conflicts that had to be resolved. If not, Git itself could just do the merge automatically somewhere and compare that to the committed merge. This could be automated.

As the word "could" above suggests, Git doesn't actually do this now. There's a new merge strategy in the works (see this answer to "When would you use the different git merge strategies?") that would act as a nice bit of enabling technology, and once it's in place, there are some vague but reasonably-straightforward-looking plans to offer this as an option (it will be somewhat compute-intensive but really quite handy, I think).

Even once all the above has landed, there's still one stumbling block. How big it would be in practice, I don't know. You'll see it if and when you use the "re-perform the merge" method manually. In particular, when someone runs git merge, they can provide options, such as -s ours or -X ours or -X find-renames=75. These options are not recorded anywhere. When you (or Git) go to re-perform a merge, what options should you supply? Fortunately, if you're redoing the merge by hand, you (presumably) know what options you should supply if any.

So, if you want to test for an evil merge, simply check out the first parent of the merge commit as a detached HEAD and run git merge <options> <hash>, supplying the desired options and the hash ID of the merge's second parent. Resolve any conflicts (correctly of course); commit the result if you like; and then compare this result to the existing merge commit.
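As a rough sketch of the "re-perform the merge" check, the script below builds a small throwaway repository containing a merge that silently drops one incoming change, redoes that merge on a detached HEAD, and diffs the two results. The repository layout, the file names and the simulated bad merge are assumptions made for illustration. The merge is redone with default options, which, as noted above, may not match what the original author used.

```shell
#!/bin/sh
# Sketch only: throwaway repository with one deliberately bad merge.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

# Base commit shared by both branches (hypothetical files).
echo "a" > a.txt
echo "b" > b.txt
echo "c" > c.txt
git add a.txt b.txt c.txt
git commit -q -m "base"
git branch -m development

git checkout -q -b feature
echo "c changed on feature" > c.txt
git commit -q -am "feature work"
git checkout -q development
echo "a changed on development" > a.txt
echo "b changed on development" > b.txt
git commit -q -am "development work"

# The merge under suspicion: the incoming change to b.txt is dropped.
git checkout -q feature
git merge -q --no-ff --no-commit development
git checkout -q HEAD -- b.txt
git commit -q -m "Merge development into feature"
suspect=$(git rev-parse HEAD)

# Re-perform the merge: detached HEAD at the first parent, merge the second.
git checkout -q --detach "$suspect^1"
git merge -q --no-edit "$suspect^2"
redone=$(git rev-parse HEAD)

# Any path listed here differs between the redone merge and the committed one.
differing=$(git diff --name-only "$redone" "$suspect")
echo "differs from a clean re-merge: $differing"
```

Because the redone merge lives only on a detached HEAD, it does not touch any branch. Checking out a branch afterwards leaves the extra commit unreferenced.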

Note that git log already shows the (abbreviated) hashes of the two parents, but if you like, you can just run git rev-parse <hash>^@. The hat-at (^@) suffix means all parents of the given commit. They come out in order, so the first hash ID listed is the first parent of the merge, and the second hash ID listed is the other parent of the merge. (For an octopus merge, git rev-parse will list more than two parents, but this won't be an octopus merge.)
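To make the parent ordering concrete, here is a minimal sketch that creates a merge in a temporary repository and reads its parents back with the ^@ suffix. The repository and branch names are hypothetical. Empty commits are used only to keep the setup short.

```shell
#!/bin/sh
# Sketch only: throwaway repository with a single two-parent merge.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

git commit -q --allow-empty -m "base"
git branch -m development
git checkout -q -b feature
git commit -q --allow-empty -m "feature work"
git checkout -q development
git commit -q --allow-empty -m "development work"

# Merge feature into development; HEAD is now the merge commit.
git merge -q --no-ff --no-edit feature

# ^@ expands to all parents, one hash per line, first parent first.
parents=$(git rev-parse HEAD^@)
first_parent=$(printf '%s\n' "$parents" | sed -n 1p)
second_parent=$(printf '%s\n' "$parents" | sed -n 2p)
echo "first parent:  $first_parent"
echo "second parent: $second_parent"
```

The first parent is the commit that was checked out when git merge ran, and the second is the commit that was merged in.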
