172

Most .gitattributes files contain * text=auto. What is the purpose of text=auto in that file?

Remi Guan
Fizer Khan

3 Answers

112

From the docs:

Each line in .gitattributes (or .git/info/attributes) file is of form:

pattern attr1 attr2 ...

So here, the pattern is *, which means all files, and the attribute is text=auto.
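To make the "pattern attr1 attr2 ..." shape concrete, here is a minimal Python sketch that splits one such line into its pattern and attributes. The function name parse_attr_line is made up for illustration, and the sketch assumes a simple whitespace-separated line; it ignores quoted patterns and comment lines, which real Git also handles.

```python
def parse_attr_line(line):
    """Split a .gitattributes-style line into (pattern, attributes).

    Simplified sketch: assumes whitespace-separated fields and no quoting.
    Attribute states follow the gitattributes forms:
      attr        -> set (True)
      -attr       -> unset (False)
      !attr       -> unspecified (None)
      attr=value  -> set to a string value
    """
    pattern, *raw_attrs = line.split()
    attrs = {}
    for raw in raw_attrs:
        if raw.startswith("-"):
            attrs[raw[1:]] = False
        elif raw.startswith("!"):
            attrs[raw[1:]] = None
        elif "=" in raw:
            name, value = raw.split("=", 1)
            attrs[name] = value
        else:
            attrs[raw] = True
    return pattern, attrs


print(parse_attr_line("* text=auto"))  # ('*', {'text': 'auto'})
```

So for * text=auto the pattern is "*" and the single attribute text carries the value "auto".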

What does text=auto do? From the documentation:

When text is set to "auto", the path is marked for automatic end-of-line normalization. If Git decides that the content is text, its line endings are normalized to LF on checkin.

What's the default behaviour if it's not enabled?

Unspecified

If the text attribute is unspecified, Git uses the core.autocrlf configuration variable to determine if the file should be converted.

What does core.autocrlf do? From the docs:

   core.autocrlf

Setting this variable to "true" is almost the same as setting the text attribute to "auto" on all files except that text files are not guaranteed to be normalized: files that contain CRLF in the repository will not be touched. Use this setting if you want to have CRLF line endings in your working directory even though the repository does not have normalized line endings. This variable can be set to input, in which case no output conversion is performed.

If you think this is all as clear as mud, you're not alone.

Here's what * text=auto does in my words: when someone commits a file, Git guesses whether that file is a text file or not, and if it is, it will commit a version of the file where all CR + LF byte pairs are replaced with LF bytes. It doesn't directly affect what files look like in the working tree; there are other settings that will convert LF bytes to CR + LF bytes when checking out a file.
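As a rough illustration of that check-in step, here is a small Python sketch. It is not Git's implementation: the "is this text?" test below (no NUL byte anywhere in the content) is a deliberately simplified assumption, and both function names are hypothetical. Git's real heuristic looks at more than that.

```python
def looks_like_text(data: bytes) -> bool:
    # ASSUMPTION: a crude stand-in for Git's text detection.
    # Content containing a NUL byte is treated as binary; everything else as text.
    return b"\0" not in data


def normalize_on_checkin(data: bytes) -> bytes:
    # Binary-looking content is stored untouched.
    if not looks_like_text(data):
        return data
    # Text-looking content has every CR + LF pair replaced with a single LF.
    return data.replace(b"\r\n", b"\n")


print(normalize_on_checkin(b"one\r\ntwo\r\n"))  # b'one\ntwo\n'
print(normalize_on_checkin(b"\x00\x01\r\n"))    # b'\x00\x01\r\n' (left alone)
```

Note that any guess of this kind can be wrong: binary data that happens to pass the text test would have its CR + LF byte pairs rewritten as well.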

Recommendation:

I would not recommend putting * text=auto in the .gitattributes file. Instead, I would recommend something like this:

*.txt text
*.html text
*.css text
*.js text

This explicitly designates which files are text files, whose CRLF line endings get converted to LF in the object database (but not necessarily in the working tree). We had a repo with * text=auto, and Git wrongly guessed that an image file was a text file and corrupted it by replacing CR + LF bytes with LF bytes in the object database. That was not a fun one to debug.

If you must use * text=auto, put it as the first line in .gitattributes, so that the later lines can override it. This seems to be becoming an increasingly popular practice.
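The reason the order matters is that when several lines match a path, a later line overrides an earlier one for the same attribute. Here is a hedged Python sketch of that "last match wins" behaviour. The rule list and the name resolve_attrs are made up for illustration, and matching is reduced to a shell-style glob on the file's basename, which is much simpler than Git's real pattern rules.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# Hypothetical rules, in file order. They correspond to these lines:
#   * text=auto
#   *.png -text
#   *.js text
RULES = [
    ("*", {"text": "auto"}),
    ("*.png", {"text": False}),  # -text: never treat as text
    ("*.js", {"text": True}),    # text: always treat as text
]


def resolve_attrs(path, rules=RULES):
    # Walk the rules top to bottom; later matches overwrite earlier ones.
    attrs = {}
    for pattern, rule_attrs in rules:
        # ASSUMPTION: match the glob against the basename only.
        if fnmatchcase(basename(path), pattern):
            attrs.update(rule_attrs)
    return attrs


print(resolve_attrs("README"))      # {'text': 'auto'}
print(resolve_attrs("logo.png"))    # {'text': False}
print(resolve_attrs("src/app.js"))  # {'text': True}
```

With the catch-all line first, the image file ends up with text unset, so it is never guessed at. If the catch-all line came last instead, it would override the more specific lines.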

Flimm
  • 4
    Why does everyone call LF "normal" but not CRLF? Is there any reference to prove it? – Yousha Aleayoub Aug 11 '17 at 13:43
  • 1
    @YoushaAleayoub What do you mean? – Flimm Aug 11 '17 at 14:33
  • 1
    @YoushaAleayoub if your `everyone` refers to `git-scm`, it's probably because they're developing a *nix package and thus using *nix newline character is _normal_. – Justin Moh Sep 13 '17 at 10:25
  • 9
    @YoushaAleayoub LF is considered "normal" because it is common in many dev tools. Popular dev tools like `git-scm` come from *nix. MacOS uses LF. Only Windows (considering mainstream OSs only) uses CRLF. This makes it harder for devs using *nix tools on Windows and for everyone when exchanging files. See also [Why CRLF](https://stackoverflow.com/questions/6521685/why-does-windows-use-cr-lf). – Roi Danton Jan 04 '18 at 09:04
  • 3
    @Flimm, can you explain the difference between `*.txt text=auto` and `*.txt text` please? I thought all 4 lines in your example above should have been `text=auto`, not just `text` after the file extension. KiCad footprint files, for instance (".kicad_mod" extension), are normalized using this line in their gitattributes file: `*.kicad_mod text=auto` (http://kicad-pcb.org/libraries/klc/G1.7/). – Gabriel Staples Aug 08 '18 at 15:37
  • 1
    "I would not recommend putting * text=auto in the .gitattributes file." why? https://www.git-scm.com/docs/gitattributes recommends doing this ("see sample around `echo "* text=auto" >.gitattributes`") – reducing activity Jul 16 '19 at 05:25
  • 1
    @MateuszKonieczny The reason why is explained later in the answer: "We had a repo with * text=auto, and Git guessed wrong for an image file that it was a text file, causing it to corrupt it as it replaced CR + LF bytes with LF bytes in the object database. That was not a fun one to debug." – Flimm Jul 17 '19 at 10:09
  • 5
    @YoushaAleayoub Don't confuse "normalization" / "to normalize" with "normal". "To normalize" means to make uniform and "normalization" is the corresponding process. The resulting uniform state is called "normalized", not "normal". For unification, one has to choose (and stick to) some convention. Git's convention (for the _object database_, not necessarily for the working directory) is `LF` line endings. See the [Mind the End of Your Line](https://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/) blog post. – das-g Apr 08 '20 at 11:20
  • 1
    (So we can assume no implication that unnormalized is "not normal" or even "abnormal" is intended in the Git documentation.) – das-g Apr 08 '20 at 11:23
  • 1
    @RoiDanton >"MacOS uses LF". This is provably not true. Mac OS has always used CR. Only with Mac OSX, Apple has broken its standard and switched from CR to LF. – Ark-kun Apr 04 '22 at 07:55
  • 1
    @Flimm Can you explain the difference between `*.js text eol=lf` and `*.js eol=lf`? – Changdae Park Apr 12 '22 at 04:00
  • I suspect the reason it's becoming popular to add `* text=auto` to `.gitattributes` is that GitHub has been recommending it in their documentation: [Configuring Git to handle line endings](https://docs.github.com/en/get-started/getting-started-with-git/configuring-git-to-handle-line-endings?platform=mac#example) Unfortunately they don't mention the risk that Git could miscategorise a binary file as text. – sengi Apr 21 '23 at 01:04
  • "there are other settings that will convert LF bytes to CR + LF bytes when checking out a file" What are those settings? I suspect it makes sense to use both of these together. – pooya13 Jun 30 '23 at 21:40
70

It ensures line endings are normalized. Source: Kernel.org

When text is set to "auto", the path is marked for automatic end-of-line normalization. If git decides that the content is text, its line endings are normalized to LF on checkin.

If you want to interoperate with a source code management system that enforces end-of-line normalization, or you simply want all text files in your repository to be normalized, you should instead set the text attribute to "auto" for all files.

This ensures that all files that git considers to be text will have normalized (LF) line endings in the repository.

Community
Dave Zych
  • 14
    What you mean by Normalized line ending? – Fizer Khan Jan 31 '14 at 05:32
  • 16
    `When a text file is normalized, its line endings are converted to LF in the repository.` – Dave Zych Jan 31 '14 at 05:32
  • 12
    Important to know, this overwrites the local core.autocrlf setting on your machine see [this great answer by @Daniel Jomphe](http://stackoverflow.com/questions/170961/whats-the-best-crlf-carriage-return-line-feed-handling-strategy-with-git) – spankmaster79 Mar 17 '15 at 09:04
  • 2
    It would be awfully nice if git simply did not $%# with any of the files being checked in to the repository. I've worked with SLM, PerForce, MsBuild, Source Depot, TFS, SVM, none of these will change even one byte in any of your files. This is an insidious git hack IMO and it has caused me a lot of pain. – Vance McCorkle Nov 02 '18 at 02:33
  • 1
    What happens on checkout is only half the story - what happens upon a get? Would it be right to say that on checkout, line endings stay as `LF`, even on windows? – Anthony Jan 22 '19 at 17:11
  • 1
    @spankmaster79, .gitattributes OVERRIDES core.autocrlf, not overwrites. – M. Thompson May 28 '21 at 11:30
9

That configuration controls how line endings are handled. When enabled, all line endings are converted to LF in the repository. There are other flags to deal with how line endings are converted in your working directory. Full info on the issue is here: https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html

Karl Zöller