116

When Git makes a commit, it reports rewritten binary files with a line such as rewrite foobar.bin (76%). What does that percentage mean? Is it the percent changed or the percent retained from the older file? I know that Git uses a binary delta for files, but I don't know how much of a rewrite the percentage represents, and it doesn't seem to be covered in git help commit.

Thanks!

Mus
dude
  • Could also be related to http://stackoverflow.com/questions/244639/git-thinks-i-am-rewriting-one-of-my-files-everytime-i-make-a-small-change – VonC Jun 25 '09 at 22:12
  • Git actually stores a complete copy of each commit for each file (as a "blob"). When you ask for a diff, Git retrieves both copies of the file in question and runs a diff at that moment. The actual diff is not stored anywhere. This doesn't really answer your question but points out that thinking of Git as storing "binary deltas" is not quite correct. – Greg Hewgill Jun 26 '09 at 07:06
  • Possible duplicate: https://stackoverflow.com/questions/13641857/what-does-it-mean-when-git-says-rewrite-or-rename-in-a-commit-message – Aaron Swan Nov 01 '21 at 17:23

3 Answers

78

It's a measure of the similarity index. The similarity index is the percentage of unchanged lines. Git thinks your file is text.

Martin Redmond
  • I believe the similarity index is unrelated to whether Git thinks the file is text. Not certain on that, as some binary files can look a lot like text. – Greg Hewgill Jun 26 '09 at 07:08

30

I believe Martin is correct, that number is the similarity index. From the git-diff man pages:

The similarity index is the percentage of unchanged lines, and the dissimilarity index is the percentage of changed lines. It is a rounded down integer, followed by a percent sign. The similarity index value of 100% is thus reserved for two equal files, while 100% dissimilarity means that no line from the old file made it into the new one.
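To make the arithmetic in that definition concrete, here is a minimal sketch that computes both indices exactly as the man page words them: unchanged or changed lines as a rounded-down percentage. This is an illustration only, not Git's implementation. Using Python's difflib to match lines and dividing by the longer file's line count are assumptions made for this example, and the function names are hypothetical. Git's real estimate may differ, especially for binary files that have no meaningful lines.

```python
import difflib


def _line_counts(old: str, new: str) -> tuple[int, int]:
    """Return (unchanged_lines, total_lines) for two text blobs.

    Assumption for this sketch: 'total' is the line count of the longer
    file, and 'unchanged' is the number of lines difflib can match up.
    """
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    total = max(len(old_lines), len(new_lines))
    return unchanged, total


def similarity_index(old: str, new: str) -> int:
    """Percentage of unchanged lines, rounded down."""
    unchanged, total = _line_counts(old, new)
    if total == 0:
        return 100  # two empty files are equal
    return unchanged * 100 // total


def dissimilarity_index(old: str, new: str) -> int:
    """Percentage of changed lines, rounded down."""
    unchanged, total = _line_counts(old, new)
    if total == 0:
        return 0
    return (total - unchanged) * 100 // total


if __name__ == "__main__":
    before = "a\nb\nc\nd\n"
    after = "a\nb\nc\nx\n"
    print(similarity_index(before, after), dissimilarity_index(before, after))
```

Because both values are rounded down, they do not always add up to 100: with two of three lines unchanged, this sketch gives 66 and 33.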

The first time I saw the number, I thought my binaries were changing dramatically!

Daniel Gill
  • So if I see a rename followed by 67%, am I to interpret that as the file having not only been renamed but also that 67% of the original file is still present? Granted, that could mean someone edited 33% of the original file during the rename. Would it be more beneficial if the rename was a single commit and the edit of the file happened afterwards? Would that give a rename similarity index of 100%? That in theory would imply that the rename was successful and no data was lost? I saw this for the first time today and I want to make sure I understand its intended use. – Eric Jun 23 '16 at 17:20

-8

It is attempting to rewrite CRs and LFs into a consistent format. That is, it doesn't see your binary file as binary. To make Git treat the file as binary, put the following line in .gitattributes:

*.bin -crlf -diff -merge

From this page that means:

all files with a [.bin] extension will not have carriage return/line feed translations done, won't be diffed and merges will result in conflicts leaving the original file untouched.

Talljoe
  • This is not what "rewrite" means in the context of the question. Git is saying "hey it looks like you rewrote this file but left 76% of it the same as it was before". – Greg Hewgill Jun 26 '09 at 07:05