18

I have a folder with lots of .cs files. Some of these files are, for some reason, treated as binary, and the git diff command doesn't work normally with them. I tried re-saving all these files with UTF-8 encoding, but it didn't help. I also tried changing the directory, directory name, filename and file extension, and each of these changes helped.

I also tried modifying the .gitattributes file to treat *.cs files as non-binary but it didn't help me:

*.cs diff=csharp

I need a way to mark all these files as non-binary without changing their path or name. Is there such a way?

John Szakmeister
Creative Magic
  • Just `*.cs diff` should enable normal diffs even if Git thinks the files are binary. What do you see when you try to diff? – CB Bailey Nov 01 '13 at 08:32
  • If I do as you wrote, I get: Binary files a/PATH/FILENAME.cs and b/PATH/FILENAME.cs differ – Creative Magic Nov 01 '13 at 08:34
  • This means that the attribute hasn't been recognized. I created some scrambled data and got "Binary files ... differ" which turned to a normal diff when I did `echo '*.cs diff' >>.gitattributes`. Do you have anything else in your `.gitattributes` file? – CB Bailey Nov 01 '13 at 08:47
  • The contents of my .gitattributes file are the following: "*.cs diff merge text", nothing else; the file itself is UTF-8 as well. Just to make sure, I wrote some nonsense in the file and got an error when I tried to git diff, so at least I know it's reading the file – Creative Magic Nov 01 '13 at 08:51

3 Answers

19

You can do this to force git to think it's text:

 *.cs diff

You'll want to make sure it actually is text though. Forcing Git to think your file is text when it actually isn't can cause extremely bad behavior in a variety of situations.

You may need to set a couple of other attributes too:

 *.cs diff merge text

The `text` attribute is useful for EOL normalization. You might need `merge` if Git still thinks the files are binary at merge time.
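To see which attributes Git actually resolves for a given path, `git check-attr` can be queried. Below is a minimal Python sketch that wraps it; the helper names `parse_check_attr` and `check_attr` and the example path are hypothetical, and it assumes `git` is on the PATH and that the output follows the documented `<path>: <attribute>: <info>` form.

```python
import subprocess


def parse_check_attr(output: str) -> dict:
    """Parse `git check-attr` output lines of the form 'path: attr: value'."""
    attrs = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        # Split from the right so a path containing ': ' stays intact.
        path, name, value = line.rsplit(": ", 2)
        attrs.setdefault(path, {})[name] = value
    return attrs


def check_attr(path: str, names=("diff", "merge", "text"), repo: str = ".") -> dict:
    """Ask Git which values the given attributes have for one path."""
    result = subprocess.run(
        ["git", "check-attr", *names, "--", path],
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_check_attr(result.stdout)


if __name__ == "__main__":
    # Hypothetical path; replace with one of the affected files.
    print(check_attr("PATH/FileName.cs"))
```

If `diff` comes back as `unset` rather than `set`, some other rule (for example `-diff` or the `binary` macro) is probably overriding the one you added.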

However, the real question is "Why is Git marking my file as binary?" The answer is that it's seeing a NUL (0) byte somewhere within the first 8000 bytes of the file. Typically, that happens because the file is being saved as something other than UTF-8, most likely UCS-2, UCS-4, UTF-16, or UTF-32. All of those encode ASCII characters with embedded NUL bytes. So, while your question says you re-saved the files as UTF-8, you may want to check again with a hex editor. I suspect that they are not UTF-8, and that's the core of the problem.
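If a hex editor isn't handy, the same check can be approximated with a few lines of Python. This is only a sketch of the heuristic described above (a NUL byte in the first 8000 bytes), not Git's actual code; the names `looks_binary`, `detect_bom`, and `report` are made up for illustration.

```python
import codecs

# Assumption: only the first 8000 bytes are scanned, as described above.
FIRST_FEW_BYTES = 8000

# Longer BOMs come first: the UTF-32 LE BOM starts with the UTF-16 LE BOM bytes.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def looks_binary(data: bytes) -> bool:
    """Return True if a NUL byte appears in the first 8000 bytes."""
    return b"\x00" in data[:FIRST_FEW_BYTES]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def report(path: str):
    """Read the head of a file and return (looks_binary, bom_encoding)."""
    with open(path, "rb") as f:
        head = f.read(FIRST_FEW_BYTES)
    return looks_binary(head), detect_bom(head)
```

Running `report` on one of the affected `.cs` files and getting `True` for the first value would suggest the file is not plain UTF-8 on disk, whatever the editor displays.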

John Szakmeister
  • Checked my file with Sublime Text 2; the encoding is shown as UTF-8. With or without the git attributes you've written, I get the same response from `git diff`: `diff --git a/PATH/FileName.cs b/PATH/FileName.cs`, `index 3936d6d..a6730f9 100644`, `Binary files a/PATH/FileName.cs and b/PATH/FileName.cs differ`. And renaming the file helps... – Creative Magic Nov 01 '13 at 08:42
  • If Git *thinks* the file is binary, you probably don't want to use `text`, though: it would seem that there are characters that Git isn't interpreting properly (stateful encoding??), so forcing Git to do line-ending mangling might produce undesirable results. – CB Bailey Nov 01 '13 at 08:43
  • @CharlesBailey Right, that's why I said "You'll want to make sure it actually is text though." But I'll make it clearer that it can cause Bad Things to Happen if you force the situation. – John Szakmeister Nov 01 '13 at 08:49
  • 1
    @CreativeMagic then something else is happening. Look in your tree for another `.gitattributes` file that is setting up your file differently. Where is the `.gitattributes` file you're setting, and at what path does the file that's causing you problems live? Somewhere in between there is likely another `.gitattributes` file that's causing the issue. – John Szakmeister Nov 01 '13 at 08:54
  • Ah, then maybe you were using the wrong name in the `.git/info/` folder. The name should be `.git/info/attributes` in the info folder. You still want to make sure that the file is not binary though. A UTF-8 file should not be detected that way. There's likely still something else happening here that's at the root of the issue (e.g., a bad gitattributes rule that is causing the `.cs` files to be treated as binary). – John Szakmeister Nov 01 '13 at 09:17
2

Use Notepad++ to change the encoding to one that does not use a Byte Order Mark (BOM). Currently, git sees these leading BOM bytes (0xFF 0xFE, the UTF-16 little-endian BOM) as the start of a binary file.

Notepad++ Encoding Menu
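For many files, the same conversion can be scripted instead of done by hand. The following is a rough Python sketch, assuming the files really do start with a BOM; `to_utf8_without_bom` is a hypothetical helper, and input without a recognized BOM is returned unchanged rather than guessed at.

```python
import codecs

# Longer BOMs first, since the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
BOM_ENCODINGS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def to_utf8_without_bom(data: bytes) -> bytes:
    """Strip a leading BOM and re-encode the text as BOM-less UTF-8."""
    for bom, encoding in BOM_ENCODINGS:
        if data.startswith(bom):
            text = data[len(bom):].decode(encoding)
            return text.encode("utf-8")
    # No BOM found: leave the bytes alone instead of guessing the encoding.
    return data
```

A caller would read each `.cs` file in binary mode, pass the bytes through the function, and write the result back; working on a copy or a clean Git checkout first is a sensible precaution.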

ergohack
1

In Sourcetree version 3.1.2:

  • Tools > Options > Diff
  • Add an extra zero to 'Size Limit (Text)'
  • Add an extra zero to 'Size Limit (Binary)'
  • Click OK
  • View > Refresh (or F5)
RJFalconer