
Sometimes when you drastically change a file, it triggers a rewrite:

yes | head -256 > pa.txt
git add .
git commit -m qu
truncate -s128 pa.txt
yes n | head -64 >> pa.txt
git commit -am ro

Result:

[master 79b5658] ro
 1 file changed, 128 insertions(+), 256 deletions(-)
 rewrite pa.txt (75%)

However, this does not happen with smaller changes:

yes | head -128 > pa.txt
git add .
git commit -m qu
truncate -s64 pa.txt
yes n | head -32 >> pa.txt
git commit -am ro

Result:

[master 88ef937] ro
 1 file changed, 32 insertions(+), 96 deletions(-)

Can I run a command that will show the percent change regardless of the amount? I looked into git diff-tree, but again it seems to only show when the change is drastic.

  • `git diff --numstat commit1 commit2` will show you the number of lines added and removed, for each file modified between `commit1` and `commit2`. However, the `75%` you see above is a Git similarity index, which measures the percentage of lines _changed_ in the original file. This is a slightly different metric than what `git diff --numstat` will show you. – Tim Biegeleisen Jan 06 '16 at 05:48
  • git diff -B1 maybe (lowering the default 50 threshold) – VonC Jan 06 '16 at 06:11
  • I have gotten a pretty low dissimilarity index with `git -c "core.pager=less -SFR" diff -B1%/1%` – VonC Jan 06 '16 at 07:06
  • I used your last example, but with `truncate -s462 pa.txt`. Then `git diff -B1%/1% @~ @ | grep diss` gives me `dissimilarity index 10%`. I use git 2.6.4 (I will check if that still works with git 2.7, released yesterday) – VonC Jan 06 '16 at 12:39

2 Answers

4
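git diff -U10000 | awk '
/^i/ {getline; next}   # skip the "index" line and the "---" line after it
/^-/ {pa += length}    # bytes on deleted lines
/^ / {qu += length}    # bytes on unchanged context lines
END {printf "%.0f%%\n", pa/(pa+qu)*100}
'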
  1. Force full context with -U10000

  2. Filter out --- lines

  3. Filter in deletions and context lines

  4. Count bytes for each
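The four steps above can also be wrapped in a small function that reads a unified diff on standard input. This is only a sketch, not part of the original answer: the name `pct_removed` is made up, headers are skipped by tracking the `diff --git` and `@@` lines rather than with `getline`, and an empty diff is reported as `0%` instead of dividing by zero.

```shell
# pct_removed (hypothetical helper): read a unified diff on stdin and print
# the share of the original bytes that sit on deleted lines.
pct_removed() {
  awk '
    /^diff --git / { in_header = 1; next }  # a per-file header starts here
    /^@@/          { in_header = 0; next }  # the first hunk ends the header
    in_header      { next }                 # skip index, --- and +++ lines
    /^-/           { removed += length($0) }  # marker char stands in for the newline
    /^ /           { kept    += length($0) }
    END {
      total = removed + kept
      if (total == 0) { print "0%"; exit }  # nothing to compare
      printf "%.0f%%\n", removed / total * 100
    }
  '
}
```

It would be used as `git diff -U10000 @~ @ | pct_removed`, which assumes the file is shorter than 10000 lines so that the whole file appears as context.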

0

With the latest git:

> git --version
git version 2.7.0.windows.1

I use:

git init dissimilarity
cd dissimilarity
yes aaa | head -128 > pa.txt
git add .
git commit -m qu
<remove a few lines>
yes n | head -32 >> pa.txt
git commit -am ro

Then a git diff -B1%/1% gives me:

> git diff -B1%/1% @~|grep diss
dissimilarity index 14%

I then made an even smaller change by manually editing pa.txt, removing a few lines and adding a new one:

> git diff @~
diff --git a/pa.txt b/pa.txt
index 7f9bf77..bf32d0b 100644
--- a/pa.txt
+++ b/pa.txt
@@ -107,13 +107,7 @@ aaa
 aaa
 aaa
 aaa
-n
-n
-n
-n
-n
-n
-n
+sss
 n
 n
 n

And even then, I still see a dissimilarity index:

> git diff -B1%/1% @~|grep diss
dissimilarity index 2%

2%!
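When a diff touches several files, grepping for `diss` loses track of which file each number belongs to. A rough sketch of one way to keep the pairing follows; the function name `list_dissimilarity` is made up, and it assumes paths contain no spaces.

```shell
# list_dissimilarity (hypothetical helper): read `git diff -B...` output on
# stdin and print "<path> <index>" for each file with a dissimilarity header.
list_dissimilarity() {
  awk '
    /^diff --git / { path = $NF; sub(/^b\//, "", path) }  # assumes no spaces in the path
    /^dissimilarity index / { print path, $3 }
  '
}
```

It would be used as `git diff -B1%/1% @~ @ | list_dissimilarity`. Files for which git emits no dissimilarity header simply do not appear in the output.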

VonC
  • This appears to break if the change is less than 400 bytes. For example `truncate -s400 pa.txt` `truncate -s0 pa.txt` works, but 399 or less will fail. Possibly related: http://github.com/git/git/blob/7548842/diffcore.h#L23 –  Jan 07 '16 at 00:44
  • @SarahManning I agree. Still: 2%! ;) – VonC Jan 07 '16 at 05:33
  • This answer is helpful, but not a comprehensive solution, as it does not work in all cases. It appears in my original question the problem was not the percentage, but the amount of bytes being changed was below the threshold –  Jan 07 '16 at 14:05
  • @SarahManning I agree, but come on... 2%! Just kidding. I don't think there is a "comprehensive" solution out of the box. Mine tries to illustrate a git-native solution. – VonC Jan 07 '16 at 14:05