10

I'm using Git to track some matlab code. A toy example best illustrates the problem. The project so far looks like this.

    C
   /
A--
   \
    B

Contents of A are x=5

We make commit C, where the line is changed to x=6

We then make commit B, where our content becomes as below

if flag==1
    x=5
end

If we attempt a merge with the goal of the project looking like

    C
   / \
A--   D
   \ /
    B

with the merge result in D, we'll get a conflict, because the main line has been changed in both (indentation added in B, 5 changed to 6 in C).

Is there a best practice way for integrating indentation changes from one branch, and content changes from another branch, to get a merge result?

I've read about one strategy in https://stackoverflow.com/a/5262473/288545, and while that would avoid the conflict, it would discard the indent in favor of the content change (which is an improvement, but still makes for harder to read code).

I suppose I could just suck it up and not change indentation when writing my code. This makes it less readable, but isn't a huge deal in matlab. However, in python, indentation really matters, so how do python folks deal with it? This gets much uglier if there are large blocks of code that we later change to be inside of control structures, so the diff touches many lines and makes merge conflicts a huge headache.

Is there a merge strategy that will handle spacing changes and content changes separately, and then integrate them? I'd like the result of the merge to be

if flag==1
    x=6
end
Community
  • 1
  • 1
Daniel Kessler
  • 1,172
  • 17
  • 21
  • In python, we direct everyone to `PEP 8` which tells them to indent their code with 4 spaces. Then, since we're all writing code with 4 space indentation, we don't have to worry about changing the indentation any more ... ;-) – mgilson Sep 11 '12 at 19:36
  • 6
    @mgilson The problem still arises if you have code that was originally stand-alone (unindented), and then you embed it in a control structure, so it must be indented. Even if you are perfect about only doing 4 spaces, you'll still get a merge conflict, because one branch introduces 4 new spaces, and the other branch introduces content change. – Daniel Kessler Sep 11 '12 at 19:39
  • My comment was mostly joking. You're right of course. However, in that case, it doesn't have to be indentation either. You'll have a merge conflict whenever there are any incompatible changes to the same block of code. That's why git forces you to deal with these incompatible changes and merge them by hand ... of course, there are a lot of tools you could try to make this less painful with `git mergetool` ... – mgilson Sep 11 '12 at 19:44
  • @mgilson You're right that this would come up in any scenario where there are two sets of changes. However, indentation changes and content changes are not necessarily incompatible, though this appears to be the case with default options for recursive merging. Imagine a merge strategy that would separately ID spacing changes and content changes. You could then apply the spacing diff to the content diff, then apply the content diff, and you'd get a much better result. – Daniel Kessler Sep 11 '12 at 19:48
  • @mgilson I've just started to play with some of the mergetool options, but would be delighted if there's one that could sort out (or auto-resolve) conflicts that are just due to indentation versus actual content-based conflict. Of course, the distinction between indentation and "content" is superficial, but I think still useful. – Daniel Kessler Sep 11 '12 at 19:49
  • Most diff/merge tools have options to ignore changes in whitespace. – Roland Smith Sep 11 '12 at 19:58
  • @RolandSmith so would a typical mergetool give me the result I'm looking for at the end of the question? – Daniel Kessler Sep 11 '12 at 20:12
  • 2
    @DanielKessler From A to D the _content_ of _every_ line changes even if you disregard the whitespace. How should an automatic merging program be able to know what you want? It looks impossible to me. – Roland Smith Sep 11 '12 at 20:18
  • @RolandSmith thanks for your thoughts. The *output* of the merge results in the content of every line being changed. However, the input to the merge has one branch with content changes and one branch without content changes. Such a merge is trivial, but for the indentation (e.g. non-content changes). If you look at the answer I linked, http://stackoverflow.com/a/5262473/288545, there is a strategy that will ignore the non-content change in favor of a content change. If it could then apply the non-content change on top of the content change, I'd be all set. – Daniel Kessler Sep 11 '12 at 20:27
  • 1
    @DanielKessler There are no branches without content changes in your example... – Roland Smith Sep 11 '12 at 20:34
  • @RolandSmith Branch B changes the indentation of one line. It does add two lines around it, but if it makes the problem simpler, you can just ignore that addition. If we divide changes into *substantial* (e.g. content-based) vs whitespace (e.g. indentation), B introduces no *substantial* changes. – Daniel Kessler Sep 11 '12 at 20:40

2 Answers2

6

The key to solving your problem is treating whitespace cleanups and function rewrites as separate commits.

As someone who does a lot of branch integration, I can honestly say the two most annoying things for integrators are: 1) coders who like to reformat files they didn't author or functions they didn't rewrite, and 2) IDEs that reformat whole files simply by opening and saving them. Sure the files are more readable in both cases, but it completely defeats the point of version control: Commits should be intelligently sized, constructed, and reviewed. Their order should have meaning. Kitchen sink commits make for useless history, and history should be nothing but useful.

This philosophy implies "Don't dump whitespace changes in commits where they don't belong." If you're rewriting the touched function, sure, improve the spacing. Otherwise leave it to its own commit and make sure others working on that file know whitespace changes are coming. That'll save you hassle at integration time.

Additionally, you can avoid accidental whitespace errors (trailing spaces, tabs after spaces, and the like) with git's stock pre-commit hook. Issue this command in your repository to set it up:

mv .git/hooks/pre-commit.sample .git/hooks/pre-commit

It more or less issues git diff --check, which is also incredibly useful for detecting these types of problems.

Christopher
  • 42,720
  • 11
  • 81
  • 99
4

Some diff tools are better than others. I use KDiff3, and have it set up to be invoked automatically using git mergetool. It's still not perfect, but it can often detect merges of the sort you describe and automatically produce a sensible merge.

me_and
  • 15,158
  • 7
  • 59
  • 96