Before it comes up, I've already looked at several threads on this topic, including these:

I'm looking at switching to Mercurial (from Subversion) for our development group. Before I do so I'm doing the usual pros / cons list.

One of my "pros" is that merging is superior in Mercurial, but so far I'm failing to find compelling evidence of that. Namely, there is a statement made on HgInit.com that I've been unable to verify:

For example, if I change a function a little bit, and then move it somewhere else, Subversion doesn’t really remember those steps, so when it comes time to merge, it might think that a new function just showed up out of the blue. Whereas Mercurial will remember those things separately: function changed, function moved, which means that if you also changed that function a little bit, it is much more likely that Mercurial will successfully merge our changes.

This would be extremely compelling functionality, but as far as I can tell, it's just hot air. I've been unable to verify the above statement.

I created a Mercurial repository, did a Hello World and then cloned it. In one I modified the function, committed it and then moved it, and then committed it. In the other I simply added another output line in the function and committed.

When I merge, I get basically the same merge conflict I would get using Subversion.

I get that Mercurial can track file renames better and that there are other advantages to DVCS besides merging, but I'm interested in this example. Is Joel Spolsky off-base here, or am I missing something?

I'm not experienced in this area, but it does seem that, since Mercurial keeps more information, it could in theory do better at merging (if developers also did frequent checkins). For example, I see it as feasible for Mercurial to get contextual changes from comparing multiple changes, e.g., I modify a function, check in, move the function, check in, and Mercurial associates those two pieces.

However, Mercurial's merging doesn't seem to actually take advantage of the added information, and appears to operate the same way as Subversion. Is this correct?

JWman
  • +1 Interesting - we're in the process of migrating our multi-gigabyte svn repo to mercurial, and one of the big selling points is easier merging. – Seth Feb 12 '11 at 20:21
  • possible duplicate of [Mercurial merge awesomeness - what am I missing?](http://stackoverflow.com/questions/4203309/mercurial-merge-awesomeness-what-am-i-missing) – Wim Coenen Feb 12 '11 at 21:03
  • I think the crucial bit in that particular quote is "*much more likely*". If you watch the video where Linus Torvalds portrays Git as the next best thing after sliced bread, he too makes the claim that Git can track functions that move around in the code base. I'm guessing that most of it is salesman hype, since I've verified that neither Mercurial nor Git in fact manages to track code that moves from one file to another. The code is tracked only as "deleted something here" and "added something else there". – Lasse V. Karlsen Feb 12 '11 at 21:48
  • Having said that, I wouldn't work with Subversion now after having used Mercurial. It's superior in merging even though it falls short on the particular scenario described here. So yeah, Mercurial is better than Subversion in merging, if you can follow the Mercurial way of doing things. You can mess up things royally though if you don't (but you can do that in Subversion too.) – Lasse V. Karlsen Feb 12 '11 at 21:49
  • Yes, that is the exact phrase that I have trouble with. In something like automated merging where there is no probabilistic approach, how is it "More likely" to succeed? In the end, merging changesets and merging the end result seems the same in Mercurial and in SVN – JWman Feb 12 '11 at 21:52

3 Answers

As far as I know, anyone who says Mercurial tracks moving code around in less than file-sized chunks is just wrong. It does track file renames independently from code changes, so if Dave renames a file and Helen changes something in the file, it can automerge that, but as far as I know, Subversion can do that too! (CVS can't.)

But there is a way in which Mercurial's merge logic is dramatically better than Subversion's: it remembers conflict resolutions. Consider this history graph:

base ---> HELEN1 ---> merge1 --> HELEN2 -> merge2
     \--> DAVE1 ---/                    /
                   \--> DAVE2 ---------/

Helen and Dave made changes independently. Helen pulled Dave's tree and merged, then made another change on top of that. Meantime, Dave went on coding without bothering to pull from Helen. Then Helen pulled Dave's tree again. (Maybe Dave is working on the main development trunk, and Helen's off on a feature branch, but she wants to sync up with trunk changes periodically.) When constructing "merge2", Mercurial would remember all of the conflict resolutions done in "merge1" and only show Helen new conflicts, but Subversion would make Helen do the merge all over again from scratch. (There are ways you can avoid having to do that with Subversion, but they all involve extra manual steps. Mercurial handles it for you.)
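One way to see why "merge2" only has to deal with new changes is to look at which changeset serves as the common ancestor of the two heads being merged. The sketch below encodes the graph above and assumes the tool picks the nearest common ancestor as the merge base. That is a simplification for illustration, not Mercurial's actual code, and the names `PARENTS`, `ancestors` and `merge_bases` are made up here.

```python
# History graph from the answer: each changeset maps to its parents.
PARENTS = {
    "base": [],
    "HELEN1": ["base"],
    "DAVE1": ["base"],
    "merge1": ["HELEN1", "DAVE1"],
    "HELEN2": ["merge1"],
    "DAVE2": ["DAVE1"],
}


def ancestors(node):
    """Return the node itself plus everything reachable through parent links."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def merge_bases(left, right):
    """Common ancestors that are not ancestors of another common ancestor."""
    common = ancestors(left) & ancestors(right)
    return {
        c for c in common
        if not any(c != other and c in ancestors(other) for other in common)
    }


print(merge_bases("HELEN1", "DAVE1"))  # merge base used for merge1
print(merge_bases("HELEN2", "DAVE2"))  # merge base used for merge2
```

Because "merge1" has DAVE1 as a parent, the base for "merge2" moves forward from `base` to `DAVE1`, so only what Dave did after DAVE1 has to be reconciled with Helen's side.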

For more information, read about the mark-merge algorithm which was developed for Monotone and AFAIK is now used by both Mercurial and Git.

zwol
  • I don't think this is correct, at least for v1.5 of SVN. According to http://svnbook.red-bean.com/en/1.5/svn-book.html#svn.branchmerge.cherrypicking, SVN DOES remember previous merges, even to the extent of allowing you to merge in changes from a specific revision in isolation (aka, cherry-picking) and then "completing" the merge later without redoing that work. – JWman Feb 14 '11 at 15:44
  • Also, interesting link to the mark-merge algorithm. I haven't read through it all yet, but I will. However, on further digging, it seems Mercurial does not do _ANY_ merging on its own according to [this](http://hgbook.red-bean.com/read/a-tour-of-mercurial-merging-work.html#x_34e) – JWman Feb 14 '11 at 15:52
  • I confess to not having used SVN since about version 1.1, but I still pretty regularly see people complaining about problems with it that they wouldn't have if it was remembering previous merges. Maybe they're all still using 1.1 too. ;-) As for Mercurial and the link you posted, they haven't done a good job of explaining the difference between "resolving conflicts", which needs an external tool pretty universally AFAIK, and *deciding whether there is a conflict to be resolved*, which is the job mark-merge does. – zwol Feb 14 '11 at 16:04
  • Ah, that does make sense. I'm ashamed to admit my lack of understanding for how even SVN works for merging in spite of the fact that I've used it for years. But it seems that VCSs differ only on finding where merges need to happen, and then using external tools to do the job. I always thought they just used external tools for resolving conflicts (e.g. - KDiff). – JWman Feb 14 '11 at 16:24
AFAIK, SVN does all its merging internally - the merge tools are there only for cases where there's a conflict as (obviously) it needs to tell you about it and get you to fix it.

The non-conflicting cases are based around applying patches - ie, svn will take the changes you made in a revision, and will apply those to the target. Recent versions of SVN (since 1.5) remember the merges you did previously, storing this information in a property associated with the directory. 1.6 does a much better job of handling these properties compared to 1.5.

SVN does not merge by comparing 2 trees and diffing them (see the book), except when the trees to merge are not related or you specify the --ignore-ancestry option; only then will it perform a diff-type merge operation. Here's a brief description of what happens. You can see this when you merge past conflicts: once you've resolved a tricky revision merge, svn remembers which revisions were merged and will apply those changes again. You can prove this by trying it: branch, edit 2 files, merge to get a conflict. Then edit only the branched file, on the same line, and merge again. It'll pop a conflict even though the target file hasn't changed, because the merge is applying a change to a line that differs from what the branched file expected (just like patch, which shows what it thinks the line it's going to change should have been). In practice you don't see this, as you don't tend to repeatedly reject your merge changes.
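A rough sketch of that patch-style behaviour, to make the experiment concrete. This is a toy model, not Subversion's implementation: `apply_change` and `MergeConflict` are hypothetical names, and a change is reduced to a single line with the text it expects to find.

```python
class MergeConflict(Exception):
    """Raised when the target line is not what the recorded change expected."""


def apply_change(target_lines, line_no, expected_old, new):
    """Apply one recorded line change to target_lines, patch-style.

    The change only applies if the target still holds the text the
    change was recorded against.
    """
    actual = target_lines[line_no]
    if actual != expected_old:
        raise MergeConflict(
            f"line {line_no}: expected {expected_old!r}, found {actual!r}"
        )
    result = list(target_lines)
    result[line_no] = new
    return result


# Trunk after the first conflict was resolved in trunk's favour.
trunk = ["greeting = 'hi'"]

# The branch then changed its own version of that line once more.
try:
    apply_change(trunk, 0, "greeting = 'hello'", "greeting = 'hello!'")
except MergeConflict as exc:
    print("conflict:", exc)
```

The trunk file was not touched between the two merges, yet the second change still conflicts, because the line it expects to replace is the branch's version and not the one trunk kept.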

However, SVN does a poor job with renamed files as it tracks them as delete+add operations. Other SCMs do a better job, but even they cannot really tell if a file is renamed, or is deleted and added, especially when that file is modified as well. Git uses some heuristics to try and determine this, but I can't see that it guarantees success. Until we have an SCM that hooks into the filesystem, I think this will remain the case.
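To give a feel for what a content-similarity heuristic looks like, here is a minimal sketch. It is not Git's algorithm: it uses Python's `difflib.SequenceMatcher` as a stand-in similarity measure, and the function name `guess_renames` and the 0.5 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def guess_renames(deleted, added, threshold=0.5):
    """Pair each deleted file with the most similar added file.

    deleted and added map path -> file content. A pair only counts as a
    rename when its similarity ratio reaches the threshold. For simplicity,
    two deleted files may end up paired with the same added file.
    """
    renames = {}
    for old_path, old_text in deleted.items():
        best_path, best_score = None, 0.0
        for new_path, new_text in added.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            renames[old_path] = best_path
    return renames


deleted = {"util.py": "def add(a, b):\n    return a + b\n"}
added = {
    "mathutil.py": "def add(a, b):\n    # sum two numbers\n    return a + b\n",
    "README": "Project notes\n",
}
print(guess_renames(deleted, added))
```

The weakness mentioned above shows up directly: edit the file heavily enough while moving it and the score drops below the threshold, at which point the tool can only see a delete plus an add.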

gbjbaanb
Two things:

  1. Mercurial does have internal code to do merges and it will only call out to an external merge tool if the internal "pre-merge" fails.

    You link to the HG Book, which says that there is no built-in tool for handling conflicts (not the same as no built-in merge), and to the Mercurial wiki, which states that Mercurial will try to do merges internally before calling external programs.

  2. You link to a question where my answer gives an explicit case where Mercurial succeeds and Subversion fails in a merge. This is an out-of-the-box comparison using the internal merge code in both Mercurial and Subversion.
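The difference in point 1 between merging internally and handing conflicts to a tool can be sketched with a toy three-way merge. This is not Mercurial's code: `premerge` is a hypothetical name, and the sketch assumes all three versions have the same number of lines, which real merge code does not require.

```python
def premerge(base, mine, theirs):
    """Three-way merge of equally long line lists.

    Returns (merged_lines, conflicts). A line is a conflict only when both
    sides changed it, and changed it differently.
    """
    merged, conflicts = [], []
    for index, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both sides agree (same change, or no change)
            merged.append(m)
        elif m == b:      # only their side changed this line
            merged.append(t)
        elif t == b:      # only my side changed this line
            merged.append(m)
        else:             # both changed it differently: needs a person or a tool
            merged.append(None)
            conflicts.append(index)
    return merged, conflicts


base = ["a", "b", "c"]
print(premerge(base, ["A", "b", "c"], ["a", "b", "C"]))  # merges cleanly
print(premerge(base, ["A", "b", "c"], ["X", "b", "c"]))  # line 0 conflicts
```

Only the second case would need to be handed to an external tool; the first is settled without one.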

Martin Geisler
  • Thanks for clarifying. Yeah, I goofed on my reading there a bit. As for point #2, I get it: Mercurial handles renames better. But my understanding is that merging two files (in terms of content) is basically identical between SVN and Hg. The question is prompted by Joel's implication that Mercurial somehow gleans more information from the process of retaining an entire changeset. In my mind it evoked the idea of incremental merges, where two changesets have a lot of incremental changes that provide enough context for Mercurial to be somehow more intelligent about merging. But apparently not. – JWman Feb 24 '11 at 03:54
  • No, there is no magic involved in Mercurial's merge machinery :-) – Martin Geisler Feb 24 '11 at 14:38