75

I've seen several questions asking how to make git treat a text file as binary, but haven't seen the opposite yet:

How do I change git's choice of treating a text file as binary? I have a text file in which some configuration strings use the EOT and ETX control characters to separate parts of the configuration parameters.

For example, the source code contains lines like this:

INPUT 'ScrollRemote[EOT]no[ETX]NumDown[EOT]0[ETX]CalcWidth[EOT]no[ETX]MaxWidth[EOT]80[ETX]FetchOnReposToEnd[EOT]yes[ETX].....'

I would like this file to be treated as text, not as binary, so that I can see a diff of the changed lines.

pkamb
RonaldB
  • 2
    Git should handle that just fine. Most of the git plumbing doesn't care about the distinction between "text" and "binary" at all, since they are treated both as simply sequences of bytes. What problem are you actually having? – Greg Hewgill Feb 20 '13 at 22:18
  • 6
    A change in a binary file results in a new copy being stored in its entirety, if I remember correctly, and a change in a text file is stored as a difference. This particular file is a text file (source code), with a few lines like the line above. Having it treated as a binary file makes me lose the ability to see differences in the source code... – RonaldB Feb 21 '13 at 14:17
  • 2
    The Git storage algorithm doesn't care whether files are considered "text" or "binary", it just stores bytes in the repository. See my answer for more details. – Greg Hewgill Feb 21 '13 at 17:45

3 Answers

89

The way files are actually stored inside the Git repository is not relevant to how they are treated when displayed. So the git diff program, when asked to compare two files, first obtains both complete files from the repository and then runs a difference algorithm against them.
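As a rough illustration of that point, the sketch below builds a throwaway repository and reads each version of a file back out of it. The file name `notes.txt`, the commit messages, and the temporary directory are assumptions made up for this example. Git may later compress objects in packfiles, but what it hands to git diff is the complete content of each version.

```shell
#!/bin/sh
set -e

# Throwaway repository in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

# First version of the file.
printf 'alpha\nbeta\n' > notes.txt
git add notes.txt
git commit -q -m "first version"

# Second version adds one line.
printf 'alpha\nbeta\ngamma\n' > notes.txt
git commit -q -am "second version"

# Each commit refers to a blob object holding the complete file content.
kind=$(git cat-file -t HEAD:notes.txt)
old=$(git cat-file -p HEAD~1:notes.txt)
new=$(git cat-file -p HEAD:notes.txt)

echo "object type: $kind"
echo "--- previous version ---"
echo "$old"
echo "--- current version ---"
echo "$new"
```

Both versions come back whole; the difference between them is computed only when a command such as git diff asks for it.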

Normally, git diff looks for non-printable characters in the files, and if a file looks likely to be binary, it refuses to show the difference. The rationale is that binary diffs are unlikely to be human-readable and will probably mess up your terminal if displayed.

However, you can instruct git diff to always treat files as text using the --text option. You can specify this for one diff command:

git diff --text HEAD HEAD^ file.txt
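To see the effect of the option in isolation, here is a minimal sketch in a throwaway repository. The file name `config.txt` and its contents are assumptions for illustration, and a NUL byte stands in for the separator characters because it reliably triggers the binary detection. `-U0` only suppresses context lines so that the captured output contains no NUL bytes.

```shell
#!/bin/sh
set -e

# Throwaway repository in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

# First version: a plain line plus a line containing a NUL byte.
printf 'line one\nkey\000value\n' > config.txt
git add config.txt
git commit -q -m "first version"

# Second version: change only the plain text line.
printf 'line 1\nkey\000value\n' > config.txt

# Without --text, git diff only reports that the files differ.
plain=$(git diff -- config.txt)
echo "$plain"

# With --text, git diff shows the changed lines.
as_text=$(git diff --text -U0 -- config.txt)
echo "$as_text"
```

The first command should end with a "Binary files ... differ" message, while the second should list the removed and added lines.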

You can make Git always use this option by setting up a .gitattributes file that contains:

file.txt diff

The diff attribute here means:

A path to which the diff attribute is set is treated as text, even when they contain byte values that normally never appear in text files, such as NUL.
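The attribute can be tried out the same way. This is a sketch under the same assumptions as any small demo: `config.txt`, its contents, and the temporary repository are hypothetical, and a NUL byte is used because it reliably makes Git consider the file binary. `git check-attr` is used only to confirm that the attribute is in effect.

```shell
#!/bin/sh
set -e

# Throwaway repository in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

# A file that would normally be detected as binary because of the NUL byte.
printf 'line one\nkey\000value\n' > config.txt

# Set the diff attribute for this one path.
echo 'config.txt diff' > .gitattributes

git add .gitattributes config.txt
git commit -q -m "first version"

# Change only the plain text line.
printf 'line 1\nkey\000value\n' > config.txt

# Confirm the attribute applies to the path.
attr=$(git check-attr diff -- config.txt)
echo "$attr"

# A plain git diff (no --text) now shows the changed lines.
# -U0 drops context lines so the captured output has no NUL bytes.
out=$(git diff -U0 -- config.txt)
echo "$out"
```

With the attribute set, the plain git diff behaves as if --text had been passed for that path.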

Greg Hewgill
6

Look at Git Attributes -- they may be able to help you by specifying that a certain file extension is to be treated as text.
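For a whole file extension, a pattern in `.gitattributes` can take the place of a single file name. The sketch below assumes a hypothetical `.cfg` extension and made-up paths; `git check-attr` reports how the rule would apply, and the paths do not need to exist.

```shell
#!/bin/sh
set -e

# Throwaway repository in a temporary directory (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical rule: every *.cfg file is diffed as text.
echo '*.cfg diff' > .gitattributes

# Ask Git which paths the rule covers.
matched=$(git check-attr diff -- settings.cfg sub/other.cfg)
unmatched=$(git check-attr diff -- notes.md)

echo "$matched"
echo "$unmatched"
```

The two `.cfg` paths should be reported with `diff: set`, while `notes.md` stays `unspecified` and keeps Git's default detection.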

Luna
Mark Leighton Fisher
  • So, if I understand the Attributes correctly, I can use it to define how to show the diff between two versions of a binary file. Git will still treat each version like a binary file and store them in entirety, correct? – RonaldB Feb 21 '13 at 15:34
  • Full documentation of Git Attributes is here: https://git-scm.com/docs/gitattributes – romaroma Jun 14 '16 at 07:34
  • 1
    @RonaldB Git always treats every file as binary and stores them in their entirety-- it doesn't have a special text-file mode, like other version control systems. The differences in how it handles binary files vs. text files only appear when using the top-level "porcelain" commands `git show` or `git diff`-- for text files, it figures out what the line endings are and displays diffs line by line. But internally, every file is stored as the whole binary file, carefully compressed against other data in the repo to minimize wasted space across revisions. – Slipp D. Thompson Oct 03 '19 at 17:33
  • @SlippD.Thompson Although the storage itself might treat them identically, features like automatic line-ending conversion before storage aren't identical between the two, right? – endolith Mar 31 '21 at 21:03
3

If you are trying to compare or merge text files and git says they are binary files, they may just have different encodings (e.g. UTF-8 and ANSI). See the answer I gave on this post.

Community
deadlydog