
Several times in the past, developers have (either explicitly or effectively) done a `git merge -s ours` when they shouldn't have. Commits were lost, and we only detected this much later, when the cleanup was much more complex and time-consuming. Tests didn't fail because the very commits which added the tests were silently reverted along with the code they were testing.

What I would like to do is find or create a tool which summarises a merge, producing output something like the following for a normal merge:

Lines              Left(2b3c4d) Right(3c4d5e)
Common with base   970          930
Unique wrt base    20            50
Unique wrt other   15            45
Unique wrt merge   15            45
Common with merge  995          985

But where a merge was done incorrectly, reverting many changes, or where a `git merge -s ours` was performed, it might produce a report something like:

Lines              Left(2b3c4d) Right(3c4d5e)
Common with base   970          930
Unique wrt base    20            50
Unique wrt other   15            45
Unique wrt merge   15             0 !!
Common with merge  990          935
Warning: 100% of changes from 3c4d5e are missing from merge commit!

If this report were run for every merge commit, we could flag (through a Jenkins job) whenever a merge was a little on the smelly side.

So far I have been playing with `git diff --stat`, `git diff --name-status` and `git diff --summary`, but none of them gives me quite what I want.

The best I can do so far would result in something like the following for a normal merge:

              base..left   base..right  left..merge  right..merge
              f67c4..a9eb4 f67c4..5b592 a9eb4..cb209 5b592..cb209
 a    |                    1 +          1 +          
 b    |       1 +                                    1 +
 base |       1 +          1 +          1 +          1 +
changed       2            2            2            2
insertions(+) 2            2            2            2
deletions(-)  0            0            0            0

and for an ours merge:

              base..left   base..right  left..merge  right..merge
              f67c4..a9eb4 f67c4..5b592 a9eb4..95637 5b592..95637
 a    |                    1 +                       1 -
 b    |       1 +                                    1 +
 base |       1 +          1 +                       2 +-
changed       2            2            0            3
insertions(+) 2            2            0            2
deletions(-)  0            0            0            2
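
The numbers in the two tables above could be gathered with something like the following. This is only a rough sketch: it assumes a two-parent merge and uses `git diff --numstat` between pairs of commits. The helper names (`git`, `parse_numstat`, `totals`, `merge_columns`) are made up for illustration, and `--no-renames` is passed only to keep the path column simple.

```python
import subprocess


def git(*args, repo="."):
    """Run a git command inside `repo` and return its stdout as text."""
    done = subprocess.run(["git", "-C", repo, *args],
                          check=True, capture_output=True, text=True)
    return done.stdout


def parse_numstat(text):
    """Turn `git diff --numstat` output into {path: (insertions, deletions)}."""
    stats = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        # Binary files are reported as "-", so count them as 0 lines.
        stats[path] = (0 if added == "-" else int(added),
                       0 if deleted == "-" else int(deleted))
    return stats


def totals(stats):
    """Return (files changed, insertions, deletions) for one column."""
    return (len(stats),
            sum(added for added, _ in stats.values()),
            sum(deleted for _, deleted in stats.values()))


def merge_columns(merge, repo="."):
    """Collect the four columns for a two-parent merge commit (assumption)."""
    left, right = git("rev-parse", f"{merge}^1", f"{merge}^2", repo=repo).split()
    base = git("merge-base", left, right, repo=repo).strip()
    ranges = {
        "base..left": (base, left),
        "base..right": (base, right),
        "left..merge": (left, merge),
        "right..merge": (right, merge),
    }
    # Diffing two commits does not touch the working tree or the index.
    return {label: parse_numstat(git("diff", "--numstat", "--no-renames",
                                     old, new, repo=repo))
            for label, (old, new) in ranges.items()}
```

Calling `totals()` on each column of `merge_columns(<merge>)` gives the changed/insertions/deletions rows. Because every diff is between two commits, this should also work on a dirty repo without stashing.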

Note that I don't only want to detect a `-s ours` merge; I also want to catch the situation where some but not all changes are in the resultant merge. This is a more general case of detecting erroneous merges than just checking for one specific cause of lost changes.

Also, this only seems to happen when there are conflicts in the merge, so any method which requires running the merge again automatically would need to resolve those conflicts automatically too.

Finally, I would like to be able to run this summary utility on a dirty repo, without stashing all of my changes first, hence my current experiments with git diff.

Any suggestions as to how I can get at this sort of information more directly than a script with a lot of parsing and reformatting would be appreciated.

The closest existing questions I can find to this are: Detect a merge made by '-s ours' (but the only answer there doesn't help) and “git merge -s ours” and how to show difference (but there are no answers at all there).

Mark Booth
  • An easy way to detect a `-s ours` merge is by running a diff against both parents of the merge commit. If one diff is **empty**, and the parents are otherwise distinct, it's a sure sign that `-s ours` (or equivalent) was performed. For example, you can loop over `$(git show --pretty=%P $commit)` and warn `if ! git diff -s --exit-code $parent..$commit`. – user4815162342 Jan 20 '15 at 18:02
  • The best way to see what happened in a merge IMO is to re-do the merge yourself, then diff the result with the actual merge. So `git checkout ; git merge ; git diff ` – Tavian Barnes Jan 20 '15 at 18:25
  • Thanks @user4815162342, that's pretty much what I'm doing above, using `git diff --stat` on `base..left`, `base..right`, `left..merge` & `right..merge`. Also, I don't want to assume a `-s ours` merge btw, I also want to catch the situation where some but not all changes are in the resultant merge. I've updated my question to make both more clear. – Mark Booth Jan 20 '15 at 18:30
  • Thanks @TavianBarnes, the problem is, we are only seeing problems like this in merge commits with conflicts, which would mean resolving all conflicts each time. I guess I could try `git checkout ; git merge -s recursive -X ours ; git diff ` and compare it to `git checkout ; git merge -s recursive -X ours ; git diff ` to find unique vs common though. – Mark Booth Jan 20 '15 at 18:35
  • Even with the conflict markers in place, `git diff` will show you how they were resolved. – Tavian Barnes Jan 20 '15 at 18:47
  • Did you see this answer to [How to Detect an Evil Merge](http://stackoverflow.com/questions/27683077/how-do-you-detect-an-evil-merge-in-git/27744011#27744011)? – Joseph K. Strauss Jan 26 '15 at 13:35
  • Thanks @JosephK.Strauss I didn't see that before. It's good to know that there is a term for what I'm trying to detect, even if the definition isn't completely nailed down. – Mark Booth Jan 27 '15 at 10:01
  • Actually, I think that what I want is an evilness heuristic, where 100% evil would be all changes from one side of the merge being discarded, while 0% evil would be all changes from both sides and no other changes. You could even have >100% evil where all changes from one side are discarded *and* there are changes from neither. *8') – Mark Booth Jan 27 '15 at 16:41
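
The parent-diff check suggested in the first comment could be sketched roughly as below. It assumes `git diff --quiet` (exit code 0 for no differences, 1 for differences) is an acceptable stand-in for the comment's `git diff -s --exit-code`. The function names are hypothetical. A hit is only a strong hint, since a merge can legitimately equal one parent when the other side's changes were already present there.

```python
import subprocess


def parents_of(merge, repo="."):
    """Return the parent hashes of `merge`, first parent first."""
    done = subprocess.run(
        ["git", "-C", repo, "rev-list", "--parents", "-n", "1", merge],
        check=True, capture_output=True, text=True)
    return done.stdout.split()[1:]  # the first field is the commit itself


def same_content(old, new, repo="."):
    """True if `git diff --quiet old new` finds no differences."""
    code = subprocess.run(
        ["git", "-C", repo, "diff", "--quiet", old, new]).returncode
    if code not in (0, 1):
        raise RuntimeError(f"git diff failed with exit code {code}")
    return code == 0


def reproduced_parents(identical):
    """Given one boolean per parent (is the merge identical to it?),
    return the indices of parents that the merge simply reproduces.
    An empty list means nothing looks suspicious by this test."""
    if all(identical):
        return []  # the parents do not differ, so nothing could be lost
    return [index for index, same in enumerate(identical) if same]


def check_merge(merge, repo="."):
    """Return the parents whose content the merge copies unchanged."""
    parents = parents_of(merge, repo)
    identical = [same_content(parent, merge, repo) for parent in parents]
    return [parents[index] for index in reproduced_parents(identical)]
```

If `check_merge(<merge>)` returns a parent, the changes from the other parent(s) are the ones worth inspecting. This only catches the all-or-nothing case, not a merge where some but not all changes were dropped.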

1 Answer


I had this exact problem, and ended up writing a tool to detect these types of merges. While I cannot share the actual code (it is technically owned by my employer), the algorithm is very simple. Let S1 be the set of all files that have changes between the merge and the merge's second parent. Let S2 be the set of all files that have changes between the merge and the merge base. Subtract S2 from S1, and you are left with the set of files that probably have lost changes due to the merge: they differ from the second parent, yet are unchanged relative to the merge base.

This will not only detect merges made with -s ours, it will also detect botched merges where some, but not all, of the changes in the second parent made it into the merge.
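
Since the original tool is not available, here is a minimal sketch of the S1 minus S2 idea as I read it, not the actual implementation. It assumes a two-parent merge and uses `git diff --name-only` to list changed files. `changed_files`, `lost_candidates` and `files_with_lost_changes` are names invented for this example.

```python
import subprocess


def changed_files(old, new, repo="."):
    """Return the set of paths that differ between two commits."""
    done = subprocess.run(
        ["git", "-C", repo, "diff", "--name-only", "--no-renames", old, new],
        check=True, capture_output=True, text=True)
    return set(done.stdout.splitlines())


def lost_candidates(s1, s2):
    """S1 minus S2: files that differ from the second parent but are
    unchanged relative to the merge base."""
    return sorted(set(s1) - set(s2))


def files_with_lost_changes(merge, repo="."):
    """Apply the S1/S2 rule to a two-parent merge commit (assumption)."""
    base = subprocess.run(
        ["git", "-C", repo, "merge-base", f"{merge}^1", f"{merge}^2"],
        check=True, capture_output=True, text=True).stdout.strip()
    s1 = changed_files(f"{merge}^2", merge, repo)  # merge vs second parent
    s2 = changed_files(base, merge, repo)          # merge vs merge base
    return lost_candidates(s1, s2)
```

Applied to the `ours` example in the question, S1 would be `{a, b, base}` and S2 would be `{b, base}`, so only `a` is flagged. The file `base`, which was changed on both sides, is not flagged, which is one reason the result is only a "probably".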

David Deutsch