
After many trials, I arrived at this simple test case:

a --> b --> c --   (master)
 \              \
  --> d --> b' --> e   (branch)

Where:

  • b' is a cherry pick of b
  • e is a merge from master.

b' was done after c, and c has modifications to the same files as b (d probably doesn't matter).

The result in e can easily be very unexpected.

Let's say all of them deal with the same file, "foobar.txt". This is how the file looks in each commit:

// ----------- a
foo

delme

bar

// ----------- b
foo

delme

new

bar

// ----------- c
foo

new

bar

// ----------- b'
foo

delme

new

bar

// ------------ e
foo

new

new

bar

Now, this was from my brief test just now, with this exact setup.
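
For anyone who wants to try it, the setup above can be rebuilt with a script along these lines. This is only a sketch: the tag names (a, b, c, bprime) and the extra file other.txt used for commit d are made up for the example, and what ends up in foobar.txt after the merge may differ depending on the git version and merge strategy in use.

```shell
#!/bin/sh
# Sketch: rebuild the a, b, c, d, b', e history in a throwaway repository.
# Tag names (a, b, c, bprime) and other.txt are hypothetical, chosen for this example.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.email "test@example.com"
git config user.name "Test"

# a: common ancestor
printf 'foo\n\ndelme\n\nbar\n' > foobar.txt
git add foobar.txt
git commit -q -m "a"
git tag a
git branch branch

# b: add "new" (on master)
printf 'foo\n\ndelme\n\nnew\n\nbar\n' > foobar.txt
git commit -q -am "b"
git tag b

# c: remove "delme" (on master)
printf 'foo\n\nnew\n\nbar\n' > foobar.txt
git commit -q -am "c"
git tag c

# d: unrelated change on the branch
git checkout -q branch
printf 'unrelated\n' > other.txt
git add other.txt
git commit -q -m "d"

# b': cherry-pick of b, made after c already exists on master
git cherry-pick b >/dev/null
git tag bprime

# e: merge master into the branch; a conflict is reported rather than aborting the script
git merge --no-edit master || echo "merge reported a conflict"
cat foobar.txt
```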

If you remove all the blank lines there, there is no such problem: the merge just reports a conflict, as I'd expect. But I don't think using any -X setting for whitespace is what we're looking for here... Or is it?

Meanwhile, in my production code, which is the reason I began researching all this and which has nowhere near as many blank lines, I saw e looking something like this instead:

// ----------- e
foo

delme

new

bar

All of this happens without merge ever reporting a conflict!

If git were to do any of its voodoo auto-merge magic here, this is what I'd expect the result to look like:

// ----------- e
foo

new

bar

But this also does not happen.


As a bit of a disclaimer...

I also tried reading the f manual, but I can't really understand many of the points under merge strategies. Plus, it doesn't really say what the resolve strategy does under the hood. For instance:

It tries to carefully detect criss-cross merge ambiguities and is considered generally safe and fast.

That says nothing.

The text about the default strategy, recursive, is longer, but I couldn't extract enough info from it either:

This has been reported to result in fewer merge conflicts without causing mis-merges by tests done on actual merge commits taken from Linux 2.6 kernel development history.

Reported? So we got one very heavy unit test and assumed, from a few reports, that it's all right?

Well, it's all too vague to me.


I think I must be doing something wrong! So, how can I do it right?

What do I need to do to get back to merging with no worries?

cregox

1 Answer


The basic problem is that there is no formal model for what it means to do correct automated merges that do the right thing in every case. In fact, "the right thing" can differ for different use cases in ways which the merge algorithm has no idea about. There have been a variety of attempts to come up with a single, correct merge algorithm that always does the right thing (various Monotone merge strategies, Codeville, Precise Codeville, Darcs, and so on), and all of them fail in some way in real-world use cases.

So, for a real-world merge algorithm, "it works pretty well on a real codebase with lots of merges" is about the best you're going to be able to do. This means that you should never blindly trust the outcome of a clean automated merge; while it may have merged cleanly without conflicts, it may not have done exactly what you expected. You still need to review what the merge did, and test the result.

My general approach is to try a couple of different merge options, like you did, to see if one of them produces the correct merge. If that doesn't work to get you the correct merge (or a merge that produces the appropriate conflict that you can resolve), then you should do git merge --no-commit, and fix up the merge as appropriate before committing it.
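
As a rough illustration of that workflow, here is a sketch in a throwaway repository. The branch name topic and the file names are invented for the example, and -X patience is just one option you might try; -s resolve would be another.

```shell
#!/bin/sh
# Sketch: review a merge before committing it, and retry with other options if needed.
# The repository, the branch "topic", and the file names are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "test@example.com"
git config user.name "Test"

printf 'one\n' > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b topic
printf 'two\n' > other.txt
git add other.txt
git commit -q -m "topic work"

git checkout -q master
printf 'three\n' > third.txt
git add third.txt
git commit -q -m "master work"

before=$(git rev-parse HEAD)

# Stage the merge result, but stop before creating the merge commit
git merge --no-commit --no-ff topic
git diff --cached --stat          # review what the merge staged

# If the result looks wrong, throw it away and try different options
git merge --abort
git merge --no-commit --no-ff -X patience topic
git diff --cached --stat

# Fix up files by hand here if necessary, then conclude the merge
git commit -q -m "Merge topic (reviewed)"
```

Until the final git commit, HEAD still points at the pre-merge commit, so nothing is recorded in history until you have looked at the result.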

Brian Campbell
  • Awesome. Makes sense. Now, since you're around, there is an extra question there: can we set any configuration to prevent the current default git algorithm from acting so differently just because there are a few extra line breaks there? And, since you mentioned `--no-commit`, I never adopted that before, but now I am wondering if it might be worth setting it in the config for main branches, such as trunks and releases. What do you think? – cregox Dec 04 '13 at 16:37
  • @Cawas I'm not sure it's worth it. The vast majority of merges work fine with the default settings; if you're encountering this problem a lot, you may be doing something odd like lots of cherry picking between branches which cause ambiguous clean merges like this; in that case, I'd try to reduce the amount of cherry picking rather than changing the default merge options. – Brian Campbell Dec 04 '13 at 17:09
  • @Cawas For your second question, doing a commit does not mean that the code is pushed. Having a clean merge automatically do a commit generally isn't a problem; just make sure you review that commit, and revert it and do a `--no-commit` merge if it failed to produce a correct commit. Again, if this is happening so often that you feel like you need to set it as a default, then you might want to review how you're doing branching and cherry-picking. – Brian Campbell Dec 04 '13 at 17:14
  • It's not happening often at all. But the few times it did happen in the past years make me worry about all the other times it didn't. I sometimes got unexplained bugs, and until now I'd never have suspected they could come from such a minor detail of merging. And I'm asking all this because putting it all on trial isn't an easy task: it might take another few years of testing to see if other settings would be worth it. I think `--no-commit` definitely [makes sense](http://stackoverflow.com/q/2850369/274502) for release branches and I'll already try that along with `--no-ff`. – cregox Dec 04 '13 at 17:54
  • If you suggest reviewing all merged commits, I don't see how that goes against adding a `--no-commit` to every merge, except that it doesn't really enforce a review. So, adding it to trunk branches may help with remembering to do it when it's most important. – cregox Dec 04 '13 at 17:55
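
On the configuration idea discussed in these comments: git has a per-branch setting, branch.<name>.mergeOptions, that supplies default options for merges into that branch. A minimal sketch, assuming the main branch is called master and using a throwaway repository:

```shell
#!/bin/sh
# Sketch: make merges into master default to --no-commit --no-ff.
# The throwaway repository and the branch name "master" are assumptions for this example.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q

# Default options applied by "git merge" when master is the current branch
git config branch.master.mergeOptions "--no-commit --no-ff"

# Show what was stored
git config --get branch.master.mergeOptions
```

This only affects the repository where it is set, so each clone would need the same setting.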