While refactoring source code, sometimes you need to move big blocks of text inside a file, or even to a new file. You create a branch refactored and commit away:

$ git checkout master
$ git branch refactored
$ git checkout refactored
<move code around>
$ git commit -m "refactored code"

However, people may commit on top of the old pre-refactor branch, changing the code that was moved:

$ git checkout master
<change code that was moved elsewhere on branch refactored>
$ git commit -m "bugfix"

On branch refactored, you then want to incorporate changes made in master:

$ git checkout refactored
$ git merge master
<giant merge conflict>

This leads to a large merge conflict. If there were a way to tell git that the content was simply moved, it should be possible to merge automatically.
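The scenario can be reproduced in a throwaway repository. The following is only a minimal sketch: the file name `code.txt`, its contents, and the `merge_status` variable are made up for illustration, and the script reads the initial branch name instead of assuming `master`. With a three-line block moved on `refactored` and its middle line changed on the base branch, the merge should stop with a conflict.

```shell
#!/bin/sh
# Reproduction sketch: code.txt and its contents are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
base=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on configuration

# Initial commit: a three-line block followed by unrelated lines.
printf '%s\n' line1 line2 line3 other1 other2 other3 other4 other5 > code.txt
git add code.txt
git commit -q -m "initial"

# On refactored, move the block to the end of the file.
git checkout -q -b refactored
printf '%s\n' other1 other2 other3 other4 other5 line1 line2 line3 > code.txt
git commit -q -am "refactored code"

# On the base branch, change the middle line of the block in its old position.
git checkout -q "$base"
printf '%s\n' line1 'line2 fixed' line3 other1 other2 other3 other4 other5 > code.txt
git commit -q -am "bugfix"

# Merge the bugfix into refactored and record whether it conflicted.
git checkout -q refactored
if git merge --no-edit "$base" >/dev/null 2>&1; then
    merge_status=clean
else
    merge_status=conflict
fi
echo "merge result: $merge_status"
```

From git's point of view, `refactored` deleted the three lines and the base branch edited one of them, so the two changes overlap.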

The worst part is that, even after resolving the conflict and committing it, git still can't use the resolution to figure out further merges:

<fix conflicts>
$ git commit -m "merge master into refactored"
$ git checkout master
<change more code>
$ git commit -m "bugfix2"
$ git checkout refactored
$ git merge master
<another giant merge conflict>

Is this avoidable at all? I've tried git rerere and it can't resolve the conflicts here. Is there any way git can see moving a block of text as an actual move, instead of a deletion and an insertion? If it can't, what's the best approach to minimizing merge conflicts if you need to keep the two parallel branches for a while?

While this is easy enough for moving the contents of a complete file, I couldn't find information on moving only part of it, or moving inside the same file.

Also, if there's a solution for this, what would be the behaviour of git blame on the refactored code? Would it point to the refactoring commit, or ignore it? Is there a way to achieve the latter?

In case anyone's interested, I've put a base64-encoded tar.gz of the (very minimal) repository I'm using for testing [on pastebin](http://pastebin.com/raw.php?i=EwZtQf7R).

Potential Solutions

One potential solution might be performing the merge by applying an (automatically) edited patch with the changes in the pre-refactored branch. Is there software developed to do this? Using this approach I guess that, since this is transparent to git, git blame would point to the refactoring commit.
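A rough sketch of the patch idea for the simplest case, a block moved within the same file. It is not an existing tool: the file name `code.txt`, its contents, and the `patched_count` variable are hypothetical. It relies on `git apply` locating a hunk at a different line offset when the hunk's context still matches, so the diff is generated with a single line of context (`-U1`) to keep the context inside the moved block.

```shell
#!/bin/sh
# Sketch of the "apply the bugfix as a patch" idea.
# code.txt and its contents are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
base=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on configuration

# Initial commit: the block line1..line3 sits in the middle of the file.
printf '%s\n' a1 a2 a3 line1 line2 line3 b1 b2 b3 > code.txt
git add code.txt
git commit -q -m "initial"

# On refactored, move the block to the end of the file.
git checkout -q -b refactored
printf '%s\n' a1 a2 a3 b1 b2 b3 line1 line2 line3 > code.txt
git commit -q -am "refactored code"

# On the base branch, fix the middle line of the block in its old position.
git checkout -q "$base"
printf '%s\n' a1 a2 a3 line1 'line2 fixed' line3 b1 b2 b3 > code.txt
git commit -q -am "bugfix"

# Back on refactored: export the bugfix as a patch with one line of context
# and apply it to the working tree instead of merging.
git checkout -q refactored
git diff -U1 "$base~1" "$base" -- code.txt | git apply

# Count how many lines now carry the fix, then commit the result.
patched_count=$(grep -c '^line2 fixed$' code.txt)
git commit -q -am "apply bugfix from $base as a patch"
echo "lines carrying the fix: $patched_count"
```

Because this creates an ordinary commit and not a merge, git records no relationship between the branches, so a later `git merge` can be expected to run into the same conflict. A block moved to another file would also need the paths in the patch header rewritten, which this sketch does not attempt.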

I've found the same question, applied to diff. There's no mention of any existing non-proprietary implementation, but there is mention of an algorithm that tracks block movement.

loopbackbee
  • I suggest you make more commits. When you change a lot of data, the more commits you have, the fewer problems you get when merging the code. – sensorario Dec 12 '13 at 13:45
  • @sensorario that's a good suggestion in general. I'm not sure how you'd apply it to refactoring, though - moving one function/small blocks at a time is not going to help – loopbackbee Dec 12 '13 at 13:52
  • If you change files you have two different contents. Yes, also if you move a function from a file to another. If you want to minimize conflicts, make more and more commits. Please try. – sensorario Dec 12 '13 at 14:01
  • @sensorario I've already tried to move a block merely 3 lines long, and change the middle line in `master`. It still conflicts – loopbackbee Dec 12 '13 at 17:31
  • What about rebasing `refactored` on top of `master`? Will you still have the merge conflicts? – Atropo Dec 17 '13 at 17:40
  • @Atropo yes, rebasing shows the same conflicts. I've put a base64 encoded tar.gz of the repository I'm using for testing [on pastebin](http://pastebin.com/raw.php?i=EwZtQf7R) – loopbackbee Dec 18 '13 at 10:39
  • This is indeed a failing of git merge. The answer is someone hacking on a feature request for git merge or a replacement for it. – Chris Moschini Mar 21 '14 at 22:19

2 Answers

It is not possible to avoid this extra effort, but then again that is how Git is supposed to work:

The other fundamentally smart design decision is how Git does merges. The merging algorithms are smart but they don't try to be too smart. Unambiguous decisions are made automatically, but when there's doubt it's up to the user to decide. This is the way it should be. You don't want a machine making those decisions for you. You never will want it. That's the fundamental insight in the Git approach to merging: while every other version control system is trying to get smarter, Git is happily self-described as the "stupid content manager", and it's better for it.

(From Wincent Colaiuta's blog)

aronisstav

Unfortunately, as far as I can tell, you can't replace the built-in git merge strategies. This means you can't stop the conflicts; however, you can use an intelligent tool to resolve them.

One such tool, Semantic Merge, looks interesting; it can also be used with git.

abasterfield
  • I've tried SemanticMerge on the example repository. Firstly, on linux, it introduces 80MB of dependencies - lots of packages from plasticscm and what appears to be mono. Merging has to be done manually using `git mergetool`, and the tool needs interaction ("Hit return to start") and uses a GUI (can't run it from CLI, AFAICT). Finally, it just failed to merge the conflicts in my example, with "parsing errors have been found, the trees could be inconsistent". I'm not sure the tool is actually doing anything, since I then get an error "Object reference not set to an instance of an object". – loopbackbee Dec 18 '13 at 11:12
  • To be fair, I've tried it on ubuntu 12.04 LTS (Precise Pangolin), which is not necessarily supported (`we guarantee support only for Debian 6.0, but this repository may be targeted also to newer Debian versions (like 7.0, for instance) and Ubuntu distributions (especially LTS versions)`) – loopbackbee Dec 18 '13 at 11:18