While exploring Subversion's functionality, I tried to test the use case described in the Undoing Changes subsection of the Basic Merging section of the Branching and Merging chapter of the svnbook. I'm using version 1.6.4, but the text for that section is the same in both versions of the book.

In my working copy directory, I edit a file, testcode.py, adding one line per edit and committing after each edit. After several commits, the file reads as follows:

this is my first import to trunk.  r1.

this is my first commit, first edit of testcode.py.  r2.

this is another edit of testcode.py.  r3.

this is an edit of testcode.py.  i'll get rid of this one.  r4.

this is another edit of testcode.py.  keeping it.  r5.

yet another edit.  keeping it.  r6.

The revision numbers in the repository match up to the lines in the file such that in /trunk/testcode.py@rN, the last line of the file is the one ending with rN. What I want to do is remove the line ending in r4, keeping everything else before and after unchanged.

Following the example in the Undoing Changes section of the svnbook, I run the command

svn merge -c -4 file:///path_to_repos/trunk

This creates a conflict (upon running that command, not on commit), whereby the merge-left file contains everything up until line r4, and the merge-right file contains everything up until line r3. In other words, instead of removing a past change, the command seems to want to revert the entire file back to either revision 3 or 4, removing changes in subsequent revisions (5 and 6, in this case).

The way I read the example in the svnbook, which has the user reversing a change committed in revision 303 and committing the result to revision 350 with no conflicts, the command I ran should have produced a file with an svn status of M that retains all lines except the one ending in r4.

Am I reading the book's example incorrectly, is the example wrong, or is there some other form of user error I fell into unawares?

krosbonz
  • Definitely reproducible. Now need to think why it happens. – malenkiy_scot Jun 26 '12 at 07:48
  • Creating a patch with `svn diff -c -4 foo.txt > foo.patch` and then applying it to `foo.txt@HEAD` works as expected - removes r4 line. – malenkiy_scot Jun 26 '12 at 08:23
  • So patches work, but "an extremely common use case for **svn merge**," as the svn book puts it, a straightforward one that got its own subsection, just plain doesn't. This does not inspire confidence in the behavior of Subversion's merge function in more complicated procedures, like reintegrating branches. – krosbonz Jun 26 '12 at 15:33
  • SVN merges are a pain in all major body parts - everybody who works with SVN knows it. I did not have enough time to check out what's going on in this case, but it is unexpected. My gut feeling is that it has to do with changes being done at the end of file, but, again, I have not checked it, yet. – malenkiy_scot Jun 26 '12 at 15:38
  • Ok, just checked what happens if the lines are added in the middle of a file - not at the end. Works fine - no conflicts. – malenkiy_scot Jun 26 '12 at 15:45
  • I tried reproducing that by adding and committing a line between those of r5 and r6, and still got a conflict. I added another before r4, and also got a conflict. What procedure did you use when yours merged cleanly? – krosbonz Jun 26 '12 at 16:18
  • The revision being reverted did not have anything added at the end - it was a new series of insertions, not the original one. – malenkiy_scot Jun 26 '12 at 18:34
  • Ah, I see. It works in that one case, meaning it is having trouble with any commits adding material after the line(s) needed to be removed. This all looks like a Subversion bug, to me. – krosbonz Jun 26 '12 at 20:50
  • Yes, I'm also thinking it may be a bug. Nice catch, by the way, bug or not. – malenkiy_scot Jun 27 '12 at 07:03
  • I was just about to ask that exact same question. What's going on here? – scherand Jul 19 '12 at 12:55
  • Might the comments to the question [Subversion: How to merge only specific revisions into trunk when multiple consecutive changes are made in a branch?](http://stackoverflow.com/q/326937/254868) shed some light on this? – scherand Jul 19 '12 at 13:49
  • I emailed users@subversion.apache.org, asking this question. I got an answer claiming that this functionality was intentional, and that a looser algorithm would allow the commit to go through without conflict. That didn't make sense, since no SVN options were the one claimed by the svn book, so I responded. The second response I got was similar, claiming that my test was a corner case, and that most code would work fine with this SVN command. I, however, still cannot reproduce the "remove an old commit" use case, and no one from the users group provided me with a working example. – krosbonz Jul 19 '12 at 17:02
  • To clarify, the second response mentioned the context issue in the question you referenced @scherand. – krosbonz Jul 19 '12 at 17:10

1 Answer

The basic issue is that Subversion's diff algorithm handles changes at the beginning and end of files in a way that's not necessarily intuitive. Your example hits that corner case, while the majority of changes in the wild do not. Consider a file that looks like this after a series of commits:

later commit (r5)
change to be reverted at beginning of file (r2)
initial commit (r1)
change to be reverted in middle of file (r3)
initial commit (r1)
change to be reverted at end of file (r4)
later commit (r5)

Trying to revert the commits at the beginning or end of the file (revisions 2 and 4 in the example) gives a conflict. Reverting the change in the middle of the file (revision 3) works as expected.

Conceptually, it might help to think of changesets as having a scope limited by surrounding lines. A change to the middle of a file is bounded by the surrounding unchanged lines. The scope of a change at the beginning or end of a file extends all the way to the beginning or end of the file regardless of how far away that point is subsequently moved.

So in the example above, the second line added in revision 5 (the one at the end of the file) falls right in the middle of revision 4's scope. In the same way that you'd expect a conflict reverting revision 10 here because changes in revision 11 are smack dab in the middle of it:

...                    <-- Line unchanged by revision 10, bounding its scope
line from revision 10  <--\
line from revision 11     | Revision 10's scope
line from revision 10  <--/
...                    <-- Line unchanged by revision 10, bounding its scope

you should expect a conflict here, for the same reason:

...                    <-- Line unchanged by revision 10, bounding its scope
line from revision 10  <--\
line from revision 11     | Revision 10's scope
<EOF>                  <--/ (No unchanged line bounding the scope this direction)

Note that this is only meant as a conceptual explanation of why the beginning and end of the file are seemingly treated differently, not as a comprehensive explanation for understanding Subversion's merge process.
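To make the scope idea concrete, here is a minimal Python sketch of it. It is a toy model, not Subversion's actual merge code: it assumes a reverse merge can be pictured as a three-way merge whose common base is the revision being undone, and it flags a conflict whenever the undo and a later edit touch the same or adjacent lines of that base. The function names and the sample file contents are made up for illustration.

```python
from difflib import SequenceMatcher


def changed_regions(base, other):
    """Return the (start, end) line ranges of `base` that `other` changes.

    A pure insertion shows up as an empty range (i, i) at the insert point.
    """
    matcher = SequenceMatcher(a=base, b=other, autojunk=False)
    return [(i1, i2) for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
            if tag != "equal"]


def collide(first, second):
    """True if two ranges overlap or touch, i.e. no unchanged line separates them."""
    return first[0] <= second[1] and second[0] <= first[1]


def reverse_merge_conflicts(before, after, working):
    """Toy check for undoing the change `before` -> `after` in `working`.

    Assumption: `after` (the revision being reverted) acts as the merge base.
    """
    undo = changed_regions(after, before)     # what the revert removes
    later = changed_regions(after, working)   # what was committed since
    return any(collide(u, l) for u in undo for l in later)


# Hypothetical stand-in for the question's file: every commit appends a line.
r3 = ["line r1", "line r2", "line r3"]
r4 = r3 + ["line r4"]
head = r4 + ["line r5", "line r6"]
print(reverse_merge_conflicts(r3, r4, head))

# Hypothetical clean case: the reverted line sits in the middle of the file,
# bounded by an unchanged line, and the later commit only appends at the end.
mid_before = ["a", "b", "c"]
mid_after = ["a", "b", "inserted", "c"]
mid_head = mid_after + ["appended later"]
print(reverse_merge_conflicts(mid_before, mid_after, mid_head))
```

In this model the first case reports a conflict, because the later lines were appended directly after the line being removed with nothing unchanged in between, while the second case merges cleanly.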

blahdiblah
  • Very interesting and 100% reproducible. Thanks! Did you find this by experimenting, did you write the diff algorithm, or is this documented somewhere? I might think about putting a "dummy" line in my files at the beginning and -- especially -- at the end now. – scherand Jul 20 '12 at 04:14
  • This is just from experience. I actually looked hard for documentation to include in this answer, but couldn't find anything. – blahdiblah Jul 20 '12 at 04:55
  • Just to make sure I fully understand, can I say the following? The first (nearest to the top) and the last (nearest to the bottom) line of a change set define its "scope". If the first line of the change is the first line in the file (or the last the last) the scope will extend to top/bottom "forever". And: if anything changed *within the scope* of that change set in a subsequent commit the original change set cannot be reversed without creating a conflict. True or false? :) – scherand Jul 20 '12 at 05:18
  • True, if the change set consists entirely of adjacent lines. If you change pieces at multiple points in a file, you get multiple scopes, not one big scope extending from the first changed line to the last. – blahdiblah Jul 20 '12 at 06:15