105

I have some .sql files that I just pushed to GitHub for the first time. However, when I look at the commit, it says:

BIN  WebRole/Sql/Database.sql View
Binary file not shown

Can someone tell me why it's saying "Binary file not shown"?

Alan2
  • possible duplicate of [Why does git think my cs file is binary?](http://stackoverflow.com/questions/2506041/why-does-git-think-my-cs-file-is-binary) – Nick Grealy May 08 '15 at 01:51

6 Answers

114

The extension alone isn't enough for GitHub to tell whether a file is text, so it has to look at the content.

And as mentioned in "Why does Git treat this text file as a binary file?", the content might not include enough ASCII characters for it to be recognized as a text file.

You can use a .gitattributes file to explicitly specify that a .sql file should be treated as text, not binary.

*.sql diff

Update 2018: as I mention in "Utf-8 encoding not working on utf-8 encoded document", Git 2.18 adds a new working-tree-encoding attribute to .gitattributes.
So, as shown in Rusi's answer:

*.sql text working-tree-encoding=UTF-16LE eol=CRLF

As kostix adds in the comments:

if these files are generated by the Microsoft SQL Management Studio (or whatever it's called in the version of MS SQL Server's management tools you're using), the files it saves are encoded in UCS-2 (or UTF-16) -- a two-byte encoding, which is indeed not text in the eyes of Git

You can see an example in "Git says “Binary files a… and b… differ” on for *.reg files"

As mentioned in "Set file as non-binary in git":

"Why is Git marking my file as binary?" The answer is because it's seeing a NUL (0) byte somewhere within the first 8000 characters of the file.
Typically, that happens because the file is being saved as something other than UTF-8. So, it's likely being saved as UCS-2, UCS-4, UTF-16, or UTF-32. All of those have embedded NUL characters when using ASCII characters
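To make the check described in that quote concrete, here is a minimal Python sketch. It is not Git's actual code: `looks_binary` and `FIRST_FEW_BYTES` are names chosen for illustration, and the only rule it implements is "a NUL byte in the first 8000 bytes means binary".

```python
FIRST_FEW_BYTES = 8000  # size of the prefix that is scanned, per the quote above


def looks_binary(data: bytes) -> bool:
    """Return True if a NUL byte occurs within the first 8000 bytes."""
    return b"\x00" in data[:FIRST_FEW_BYTES]


sql = "SELECT 1;\r\n"

# UTF-8 encodes ASCII characters as single non-zero bytes: no NUL, so "text".
print(looks_binary(sql.encode("utf-8")))      # False

# UTF-16LE pads every ASCII character with a zero byte, so "binary".
print(looks_binary(sql.encode("utf-16-le")))  # True
```

This is why a script saved by SSMS as UTF-16 trips the check on its very first character, however plain the SQL itself is.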


As Neo mentions in the comments (and in Why does Git treat this text file as a binary file?):

You can change the encoding of a saved file in SSMS to UTF-8 by selecting encoding 'UTF-8 with signature' from the 'Advanced Save Options' menu item in the File menu.

VonC
  • 20
    @Alan, if these files are generated by the Microsoft SQL Management Studio (or whatever it's called in the version of MS SQL Server's management tools you're using), the files it saves are encoded in UCS-2 (or UTF-16) -- a two-byte encoding, which is indeed not text in the eyes of Git. – kostix Jan 26 '15 at 08:05
  • @kostix good point. I have included your comment in the answer for more visibility, as well as a few other references. – VonC Jan 26 '15 at 08:33
  • 19
    You can change the encoding of a saved file in SSMS to UTF-8 by selecting encoding 'UTF-8 with signature' from the 'Advanced Save Options' menu item in the File menu. Source: http://stackoverflow.com/a/21170043/197591 – Neo Oct 29 '15 at 12:58
  • 2
    @Neo Good point. I have included your comment in the answer for more visibility. – VonC Oct 29 '15 at 13:23
  • What if I'm sure that the encoding of the file is UTF-8 (though it could have been UCS-2 at some point), and if I rename it, it shows up as text, but if I keep the old name, it's still treated as binary? – Dan M. Nov 21 '16 at 23:20
  • @DanM. Not sure: try and ask that to Github support: I would be curious to read their answer. – VonC Nov 21 '16 at 23:27
  • 7
    Another neat trick, if you're running Git Bash in Windows and don't want to overwrite any changes you've made to the files, is just typing "dos2unix *.sql". That will convert all UCS2 files to UTF8, allowing git to recognize the text. – Slothario Feb 06 '17 at 23:22
  • It seems dos2unix changes the encoding to "Western European (Windows) - Codepage 1252". – tbfa Apr 27 '18 at 16:02
  • 1
    @thebfactor check the option '`iso`' of that command dos2unix to see if that helps: https://www.computerhope.com/unix/dos2unix.htm – VonC Apr 27 '18 at 16:54
  • This started happening for certain `pom.xml` files in our repositories. What puzzles me is that the diffs were initially clear, so who changed the encoding, and how (it might be one of the contributors)? Is there a way I can simply update the encoding and get away with this? – Naman Nov 11 '20 at 04:26
17

From version 2.18, git has an option working-tree-encoding precisely for these reasons. See gitattributes docs.
[Make sure your Git version (and that of everyone who'll use the repo) is at least 2.18]

Find out the encoding of the .sql file, e.g. with the `file` command.

If (say) it's UTF-16 without a BOM on a Windows machine, then add this to your .gitattributes file:

*.sql text working-tree-encoding=UTF-16LE eol=CRLF

If it's UTF-16 little-endian (with a BOM), make it:

*.sql text working-tree-encoding=UTF-16 eol=CRLF
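
If the `file` command isn't available (e.g. on a plain Windows machine), a byte order mark can also be detected by hand. The following is a rough Python sketch, not a full encoding detector: `sniff_bom` is a hypothetical helper name, and it only recognizes a BOM at the start of the data, so a result of `None` means "no BOM found", not "not UTF-16".

```python
import codecs
from typing import Optional

# UTF-32 is checked before UTF-16 because the UTF-32LE BOM (FF FE 00 00)
# begins with the same two bytes as the UTF-16LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


with_bom = codecs.BOM_UTF16_LE + "SELECT 1;".encode("utf-16-le")
without_bom = "SELECT 1;".encode("utf-16-le")

print(sniff_bom(with_bom))     # UTF-16LE
print(sniff_bom(without_bom))  # None
```

Following the two cases above, a file where a UTF-16 BOM is found would get working-tree-encoding=UTF-16, and a UTF-16 little-endian file without one would get working-tree-encoding=UTF-16LE.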
Rusi
  • The fun part is if you have a good mixture of ASCII, Windows-1252 and UTF-16. It's interesting Git doesn't use `file` command, or the same logic as `file` command to guess the encoding when you specify `text`. I presume Git developers don't want to start down the rabbit hole of guessing encodings. Unless Git does use similar logic to `file` when you specify `text` in gitattributes. If you specify `working-tree-encoding=UTF-16LE` and some of your sql files are Windows-1252, I think you might hit problems as Git tries to convert to UTF-8 for internal use. – Jason S Jun 15 '23 at 12:29
  • @JasonS I'm not sure what you're saying. I'm guessing you're mixing [line endings](https://stackoverflow.com/a/59644154/3700414) with file encoding (this question). You should think of these two attributes as orthogonal – Rusi Jun 15 '23 at 12:38
  • No I'm talking about encoding, not line endings. Consider you have multiple developers writing sql (on Windows for SQL Server), and your `working-tree-encoding=UTF-16` gitattributes setting for all sql files, but one of your developers saves as Windows-1252 or even ASCII. Won't Git think the file is UTF-16, try to convert it to UTF-8 (for internal use), and possibly fail, depending on the content of the Windows-1252 or ASCII encoded file? I haven't tested this. – Jason S Jun 15 '23 at 13:10
  • @JasonS If you specify the encoding as X and supply a file that's actually in Y, you can be sure the file will be garbled. [Untested, from manual]. You are expected to respect your (own!) `.gitattributes`! `text` is a different matter: You can and must have CRLF on Windows and LF on *nix. That's the only case where different files on different OSes become unified in git – Rusi Jun 15 '23 at 13:22
11

Using the accepted answer from the linked question and a few other comments, I came up with this solution to the issue. It works and runs on Windows 10:

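# Encoder for UTF-8 without a byte order mark (BOM)
$Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
# Rewrite every .sql file under the current directory as UTF-8 without a BOM
Get-ChildItem -Recurse *.sql | ForEach-Object {
    $MyPath = $_.FullName
    # Read the file as lines of text
    $Contents = Get-Content $MyPath
    # Write the lines back to the same path using the BOM-less encoder
    [System.IO.File]::WriteAllLines($MyPath, $Contents, $Utf8NoBomEncoding)
}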
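
For anyone without PowerShell, the same idea can be sketched in Python. This is an illustration under stated assumptions, not a drop-in tool: `to_utf8_no_bom` and `convert_tree` are hypothetical names, and the input is assumed to be either UTF-16 with a BOM or UTF-8 (with or without a BOM). Anything else, such as a Windows-1252 file with non-ASCII characters, raises `UnicodeDecodeError` before the file is written, so that file is left untouched.

```python
import codecs
from pathlib import Path


def to_utf8_no_bom(data: bytes) -> bytes:
    """Re-encode UTF-16 (with BOM) or UTF-8 bytes as UTF-8 without a BOM."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order and drops it.
        text = data.decode("utf-16")
    else:
        # "utf-8-sig" drops a UTF-8 BOM if one is present.
        text = data.decode("utf-8-sig")
    return text.encode("utf-8")


def convert_tree(root: str) -> None:
    """Rewrite every .sql file under root in place as UTF-8 without a BOM."""
    for path in Path(root).rglob("*.sql"):
        # Working on bytes keeps the existing line endings as they are.
        path.write_bytes(to_utf8_no_bom(path.read_bytes()))
```

Calling `convert_tree(".")` from the repository root would process the same set of files as the PowerShell loop above. It overwrites files in place, so commit or back up first.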
Walliski
Carl
4

For those struggling with this issue in SSMS for 2008 R2 (yes, still!), you can set the default encoding as follows:

  • Locate directory C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\SqlWorkbenchProjectItems\Sql

Locations may vary. This is the directory used by the default installation on Windows 7 64-bit.

  • In this location, add (or edit) the empty SQL file SQLFile.sql.

This is used as a template for new .SQL files. Save it using the encoding you require (in my case, Windows-1252 with Windows line endings). The arrow to the right of the 'Save' button gives you a choice of encodings.

You need to co-ordinate encodings with your development team to avoid git and SSMS hassle.

Resource
  • 2
    I found this file for SSMS 2012 at `C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn\ManagementStudio\SqlWorkbenchProjectItems\Sql` – Aaron D Apr 19 '16 at 21:04
  • 1
    And SSMS2016: `C:\Program Files (x86)\Microsoft SQL Server\130\Tools\Binn\ManagementStudio\SqlWorkbenchProjectItems\Sql` – Coxy Apr 26 '18 at 06:28
4

Here is a quick workaround that worked for me, using SSMS 2012. Under Tools => Options => Environment => International Settings, change the language from "English" to "Same as Microsoft Windows" (it may prompt you to restart SSMS for the change to take effect). SSMS will then no longer use UTF-16 as the default encoding for new files: all new files I create now have Codepage 1252 (File => Advanced Save Options), which is an 8-bit encoding and seems to cause no problems with git diff.

John Smith
1

The way to resolve this issue is to force the file to use 8-bit encoding. You could run this PowerShell script to change the encoding of all .SQL files in the current directory and its subdirectories.

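# Re-save every .sql file under the current directory as UTF-8.
# Note: in Windows PowerShell, -Encoding UTF8 writes a BOM.
Get-ChildItem -Recurse *.sql | foreach {
  $FileName = $_.FullName;
  [System.IO.File]::ReadAllText($FileName) | Out-File -FilePath $FileName -Encoding UTF8;
}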
Gyromite
  • 2
    A solid strategy, however, this didn't remove the BOM marker for me, which is what git treats as binary. Instead, I used the answer to [Using PowerShell to write a file in UTF-8 without the BOM](http://stackoverflow.com/a/5596984/1366033) which uses `[System.IO.File]::WriteAllLines($MyPath, $MyFile, $Utf8NoBomEncoding)` – KyleMit Nov 21 '16 at 22:34