37

Suppose a file (1.c) contains three functions, with changes made by authors M and J. Running git blame 1.c gives the following output:

^869c699 (M 2012-09-25 14:05:31 -0600  1) 
de24af82 (J 2012-09-25 14:23:52 -0600  2) 
de24af82 (J 2012-09-25 14:23:52 -0600  3) 
de24af82 (J 2012-09-25 14:23:52 -0600  4) public int add(int x, int y)  {
de24af82 (J 2012-09-25 14:23:52 -0600  5)    int z = x+y;
de24af82 (J 2012-09-25 14:23:52 -0600  6)    return z;
de24af82 (J 2012-09-25 14:23:52 -0600  7) }  
de24af82 (J 2012-09-25 14:23:52 -0600  8) 
^869c699 (M 2012-09-25 14:05:31 -0600  9) public int multiplication(int y, int z){
^869c699 (M 2012-09-25 14:05:31 -0600 10)    int result = y*z;
^869c699 (M 2012-09-25 14:05:31 -0600 11)    return temp;
^869c699 (M 2012-09-25 14:05:31 -0600 12) }
^869c699 (M 2012-09-25 14:05:31 -0600 13) 
^869c699 (M 2012-09-25 14:05:31 -0600 14) public void main(){
de24af82 (J 2012-09-25 14:23:52 -0600 15)    //this is a comment
de24af82 (J 2012-09-25 14:23:52 -0600 16) }

Now, if author A swaps the positions of the multiplication() and add() functions and commits the change, git blame can detect the code movement. See the following output:

$ git blame  -C -M e4672cf82 1.c
^869c699 (M 2012-09-25 14:05:31 -0600  1) 
de24af82 (J 2012-09-25 14:23:52 -0600  2) 
de24af82 (J 2012-09-25 14:23:52 -0600  3) 
e4672cf8 (M 2012-09-25 14:26:39 -0600  4) 
de24af82 (J 2012-09-25 14:23:52 -0600  5) 
^869c699 (M 2012-09-25 14:05:31 -0600  6) public int multiplication(int y, int z){
^869c699 (M 2012-09-25 14:05:31 -0600  7)    int result = y*z;
^869c699 (M 2012-09-25 14:05:31 -0600  8)    return temp;
^869c699 (M 2012-09-25 14:05:31 -0600  9) }
^869c699 (M 2012-09-25 14:05:31 -0600 10) 
^869c699 (M 2012-09-25 14:05:31 -0600 11) public void main(){
de24af82 (J 2012-09-25 14:23:52 -0600 12)    //this is a comment
e4672cf8 (M 2012-09-25 14:26:39 -0600 13) }
de24af82 (J 2012-09-25 14:23:52 -0600 14) public int add(int x, int y){
de24af82 (J 2012-09-25 14:23:52 -0600 15)    int z = x+y;
de24af82 (J 2012-09-25 14:23:52 -0600 16)    return z;
e4672cf8 (M 2012-09-25 14:26:39 -0600 17) }

However, if I run git diff between these two revisions, it does not detect that the functions changed location, and gives the following output:

$ git diff -C -M de24af8..e4672cf82 1.c

diff --git a/1.c b/1.c
index 5b1fcba..56b4430 100644
--- a/1.c
+++ b/1.c
@@ -1,10 +1,7 @@



-public int add(int x, int y){
-       int z = x+y;
-       return z;
-}      
+

public int multiplication(int y, int z){
    int result = y*z;
@@ -13,4 +10,8 @@ public int multiplication(int y, int z){

 public void main(){
    //this is a comment
-}
\ No newline at end of file
+}
+public int add(int x, int y){
+       int z = x+y;
+       return z;
+}      
\ No newline at end of file

My questions are:

  1. How can I make git diff detect code movement in its output? Is it even possible?

  2. git diff accepts several options, for example --minimal and --patience. How can I apply those options here? I tried one, but got the following error:

    $ git diff --minimal de24af8..e4672cf82 1.c
    usage: git diff <options> <rev>{0,2} -- <path>*
    

Can anyone show an example of how to pass these options correctly?

Michael
Muhammad Asaduzzaman
  • Since more recent Git releases do exactly what you want above, it would reduce confusion for future readers if you changed the accepted answer. Try [this](https://stackoverflow.com/a/47192896/8910547) out to see if you agree. – Inigo Apr 30 '20 at 09:40

3 Answers

64

As of Git 2.15, git diff supports detecting moved lines with the --color-moved option. It even detects moves between files.

It works, obviously, for colorized terminal output. As far as I can tell, there is no option to indicate moves in plain text patch format, but that makes sense.

For default behavior, try

git diff --color-moved

The option also takes a mode, which currently is one of no, default, plain, zebra and dimmed_zebra (use git help diff to get the latest modes and their descriptions). For example:

git diff --color-moved=zebra
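
To try this without touching a real project, the move can be reproduced in a scratch repository. This is only a sketch: the file contents, commit messages and identity below are made up for illustration, and it assumes Git 2.15 or later. `--color=always` is added because Git normally drops colors when its output is piped or redirected.

```shell
#!/bin/sh
set -eu

# Scratch repository; path, file contents and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# First commit: add() comes before multiplication().
cat > 1.c <<'EOF'
int add(int x, int y) {
    int z = x + y;
    return z;
}

int multiplication(int y, int z) {
    int result = y * z;
    return result;
}
EOF
git add 1.c
git commit -q -m "add before multiplication"

# Second commit: the two functions swap places, nothing else changes.
cat > 1.c <<'EOF'
int multiplication(int y, int z) {
    int result = y * z;
    return result;
}

int add(int x, int y) {
    int z = x + y;
    return z;
}
EOF
git commit -q -am "swap function order"

# Force color so the escape codes survive a pipe or redirect.
git diff --color=always --color-moved=zebra HEAD~1 HEAD -- 1.c
```

In a terminal, the lines Git treats as moved should show up in different colors from ordinary deletions and additions. Which of the two functions is reported as the moved one depends on how the underlying diff happens to line up.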
Inigo
  • Is there anything similar for _GitHub_? – Boris Yakubchik Jan 18 '19 at 13:27
  • Any way to enable this by default? – David Schumann Dec 03 '19 at 11:46
  • @DavidNathan Yes, use git config to set diff.colorMoved – Inigo Dec 19 '19 at 10:43
  • Thanks! For anyone wondering, the command could be: `git config diff.colorMoved true --global` – David Schumann Jan 06 '20 at 16:22
  • for a specific color use git config --global diff.colorMoved plain – 8ctopus May 10 '20 at 09:11
  • @davidschumann correction: `git config --global diff.colorMoved true` (--global comes before option name) – Kyle Rogers Nov 06 '21 at 18:48
  • Can we use it with `--color-words`? – alper Oct 25 '22 at 15:42
  • @alper It's trivial to test. – Inigo Oct 25 '22 at 17:59
  • It does not work correctly using `--color-words`: https://stackoverflow.com/questions/74197000/how-can-i-keep-color-moved-theme-while-using-word-diff-color-in-git – alper Oct 25 '22 at 21:08
  • @alper That is unsurprising as `--color-moved` operates on lines (see my description above) while `--color-words` operates on words. – Inigo Oct 26 '22 at 03:37
  • Ah got it. I wish there was a way to combine results of `--color-moved` and `--color-words` – alper Oct 26 '22 at 07:30
  • @alper See README at https://github.com/git/git for how to submit feature requests. Don't just state your wish as above, but describe a coherent and useful way of doing so and propose it. If you can't, you'll then understand why they haven't already done it ;) – Inigo Oct 26 '22 at 09:40
  • "describe a coherent and useful way of doing so and propose it" Well said. I am not sure I am capable enough to implement a patch for Git; for me it's like attempting to cross an ocean in a small craft :) but I will work on it to understand the reason behind it – alper Oct 26 '22 at 10:53
  • @alper I don't think the hard part is the code (leave it to others) but describing how your proposal would work *logically*. Provide enough example cases to demonstrate. – Inigo Oct 26 '22 at 12:50
25

This was the best answer at the time it was written, but it is no longer accurate. In 2017, Git 2.15 upgraded its diff to do move detection. As explained in the now top-voted answer, use git diff --color-moved.

Original answer:

What you're running up against here is that Git largely stays out of advanced diffing like this. There's a reason Git allows configuration of external diff and merge tools: you'd go insane without their assistance. Beyond Compare and Araxis Merge would both catch this movement, as an example.

The general class of problem you're looking to solve is a "structured merge": Structural Diff of two java source files

You might have a bit more luck with git-format-patch than with git-diff in this case because the former provides more commit info, including author and commit message, and also generates a patch file for each commit in the range you specify. Source: What is the difference between 'git format-patch' and 'git diff'?

If you're looking for tips on detecting code moves generally, it's interesting to note that detection of code movement is explicitly not a goal of the all-powerful pickaxe. See this interesting exchange: http://gitster.livejournal.com/35628.html

If you wanted to detect who swapped the order, it seems your only option would be to do something like:

 git log -S'public int multiplication(int y, int z){
    int result = y*z;
    return temp;
 }

 public void main(){
    //this is a comment
 }
 public int add(int x, int y)  {
    int z = x+y;
    return z;
 }'

What you're looking for is git blame -M<num> -n, which does something pretty similar to what you're asking:

-M[<num>]
       Detect moved or copied lines within a file. When a commit moves or
       copies a block of lines (e.g. the original file has A and then B,
       and the commit changes it to B and then A), the traditional blame
       algorithm notices only half of the movement and typically blames
       the lines that were moved up (i.e. B) to the parent and assigns
       blame to the lines that were moved down (i.e. A) to the child
       commit. With this option, both groups of lines are blamed on the
       parent by running extra passes of inspection.

       <num> is optional but it is the lower bound on the number of
       alphanumeric characters that git must detect as moving/copying
       within a file for it to associate those lines with the parent
       commit. The default value is 20.

-n, --show-number
       Show the line number in the original commit (Default: off).
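
The behaviour described in the man page excerpt can be tried on a scratch repository. The sketch below is an illustration under assumptions, not the asker's repository: authors M and J, the file contents and the commit messages are placeholders. It blames the same file with and without -M and counts how many lines end up attributed to the commit that only swapped the functions.

```shell
#!/bin/sh
set -eu

# Scratch repository; authors "M" and "J" and the file contents are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Commit 1 (author M): add() first, then multiplication().
cat > 1.c <<'EOF'
int add(int x, int y) {
    int z = x + y;
    return z;
}

int multiplication(int y, int z) {
    int result = y * z;
    return result;
}
EOF
git add 1.c
git -c user.name=M -c user.email=m@example.com commit -q -m "initial version"

# Commit 2 (author J): swap the two functions without editing them.
cat > 1.c <<'EOF'
int multiplication(int y, int z) {
    int result = y * z;
    return result;
}

int add(int x, int y) {
    int z = x + y;
    return z;
}
EOF
git -c user.name=J -c user.email=j@example.com commit -q -am "swap function order"

echo "== plain blame =="
git blame -n 1.c
echo "== blame -M =="
git blame -M -n 1.c

# Count the lines attributed to the commit that only moved code.
# -l prints full hashes so they can be matched against rev-parse output.
second=$(git rev-parse HEAD)
without_m=$(git blame -l 1.c | grep -c "^$second" || true)
with_m=$(git blame -l -M 1.c | grep -c "^$second" || true)
echo "lines blamed on the move commit: $without_m without -M, $with_m with -M"
```

With -n, the second number column is the line number in the commit each line is blamed on, which is the old-to-new line mapping discussed in the comments below. Short lines such as a blank line can still be attributed to the move commit, since they fall under the <num> threshold.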
Joshua Goldberg
kayaker243
  • Thanks for the answer. Can we use the blame information to determine which line in a version comes from which line in the previous version? There is a --porcelain option available to use with blame. That provides line mapping information, although the output seems confusing to me. Can I use that to track line location? Could you please comment on this? – Muhammad Asaduzzaman Oct 10 '12 at 16:37
  • entirely different concept here. i'll look into the `git blame` info, but porcelain is best thought of as it pertains to toilets. namely, porcelain is the 'pretty' facade over the git plumbing. porcelain commands are all the ones you're familiar with: `git add`, `git tag`, `git commit`. plumbing commands are the crazy dangerous ones you don't want to mess with. more info: http://stackoverflow.com/questions/6976473/what-does-the-term-porcelain-mean-in-git and http://www.tin.org/bin/man.cgi?section=7&topic=git it's worth mentioning, there are unfortunate, confusing exceptions. – kayaker243 Oct 10 '12 at 19:06
  • `--porcelain` in the context of `git blame` and `git status` are two examples of the unfortunate conflating of these terms. porcelain in this context is a machine-consumable version of the command. – kayaker243 Oct 10 '12 at 19:10
  • updated w the command to show you how to make blame show you the line number in the original commit, but i'd assume it to mean the last commit that actually modified the contents of the line. – kayaker243 Oct 10 '12 at 19:19
  • Thanks much. I ignored the -n option, which can help me to get the mapping with the previous line. --porcelain can also provide the same information but that requires quite a bit of parsing. – Muhammad Asaduzzaman Oct 11 '12 at 23:52
  • this answer is very helpful. A follow up question is why git*hub* doesn't use a more advanced diffing tool. Doing a PR to someone where the PR just moves around a lot of code looks onerous to the reviewer, when in fact the code may change very little. – Tommy Jul 26 '16 at 18:20
  • This answer, though once right, is [no longer correct as of Git 2.15](https://stackoverflow.com/a/47192896/8910547). The change also addresses your point, @tommy. – Inigo Dec 05 '17 at 19:41
2

In this particular case, I don't think git diff is concerned about detecting code movement; rather, it's simply creating a patch that can be applied to transform the old file into the new file, which is what your git diff output plainly shows: the function is being deleted from one location and inserted in another. There are probably more succinct ways to output a series of edit commands that move code from one location to another, but I think git might be erring on the side of portability here. There's no guarantee the end user would always use git apply or git am, so the patch is produced in a format that can be used even with plain patch.
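
The portability point can be checked with a small experiment. The sketch below assumes a scratch repository with made-up file contents and that the standalone patch(1) utility is installed. It saves a pure move as a regular git diff, restores the old file, and replays the change with patch instead of Git.

```shell
#!/bin/sh
set -eu

# Scratch repository; file contents and identity are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Old version: add() before multiplication().
printf '%s\n' \
    'int add(int x, int y) { return x + y; }' \
    '' \
    'int multiplication(int y, int z) { return y * z; }' > 1.c
git add 1.c
git commit -q -m "add before multiplication"

# New version: same two functions, swapped.
printf '%s\n' \
    'int multiplication(int y, int z) { return y * z; }' \
    '' \
    'int add(int x, int y) { return x + y; }' > 1.c
git commit -q -am "swap function order"

# Save the move as an ordinary unified diff, outside the working tree.
patchfile=$(mktemp)
git diff HEAD~1 HEAD -- 1.c > "$patchfile"

# Restore the old version, then replay the change with patch(1) instead of Git.
git checkout -q HEAD~1 -- 1.c
patch -p1 < "$patchfile"

# No output here means the patched file matches the committed new version.
git diff HEAD -- 1.c
```

The -p1 strips the a/ and b/ prefixes that git diff writes by default, so patch can find 1.c relative to the repository root.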

twalberg