I've recently been reading up a bit on .gitattributes and also found places like this one, https://github.com/alexkaratarakis/gitattributes, where they try to maintain gitattributes for all file types. However, looking through those files, I instinctively think this is an unmaintainable mess. It means you'd have to update that file any time you use any new file extension, or any software brings out a new file extension, which is just impossible. When you're working with a team of 30+ people it's a nightmare to maintain a file like that; we can barely maintain a simple icons.svg file.

On top of that, I have been coding and using git for many years, on many different projects, and I've never used .gitattributes. We use things like prettier on our project, which rewrites newlines to "lf", and we have devs on Windows; this has never caused any issues, and VS Code has never had issues with it either. Git also automatically detects binary files like PNGs and shows text diffs for files like SVGs; I've never had to configure that.

So I ask the question, is it really necessary to have this file? Because it seems to me like it's signing up for a ton of maintenance that's completely unnecessary and that git is smart enough to figure out what it should or shouldn't do with a file.

  • If you use a sane system like Linux or macOS, you just don't need anything at all: Git doesn't mess with files and you don't tell it to mess with files. Use `.gitattributes` if and only if you have to have Git mess with files because users A through M use macOS and users N through Z use Windows. – torek Jul 22 '22 at 22:30
  • The big problem with trusting Git is that Git's interpretation of "what's binary" and "what's text" simply doesn't work for some (hard) cases. If it's been working for you, that's fine, but eventually you'll stumble across one where it doesn't. – torek Jul 22 '22 at 22:32
  • I can't speak for everything in `.gitattributes`, but it can help keep people from pushing files with identical contents but different line endings (CR+LF vs. LF only). It prevented a lot of headaches. – Jonathon S. Jul 22 '22 at 22:33
  • Could you please provide me with a concrete example, or exact file extension, of where this would actually give an issue? And when it does give an issue, what exactly is the issue? –  Jul 22 '22 at 22:52
  • Had you noticed that Windows development uses Git? I'm pretty sure they have more than thirty people working on it. Do you have some concrete evidence? I think that link you found is someone who's bought the make-every-little-step-explicit line, and hasn't noticed the sinker or the hooks yet. – jthill Jul 22 '22 at 23:02
  • @jthill lol! that's funny –  Jul 22 '22 at 23:07
  • Another thing not yet mentioned in the answers is the `diff` attribute, which lets you do useful things like looking at the history of a particular function in your codebase. – philb Jul 23 '22 at 02:22
  • Is there any advantage to using something like `*.css text diff=css` instead of just `text`? Where exactly does `diff` make a difference? –  Jul 23 '22 at 02:26

2 Answers

is it really necessary to have this file?

Yes, for any Git-related setting (eol, diff, merge filters, content filters, ...) that you want every collaborator on the repository to follow.

This differs from git config, which, for security reasons, remains local (it can include both sensitive information and dangerous directives).

A .gitattributes is part of your versioned source code, and contributes to establishing a common Git standard.
For instance, I always put (as in VonC/gitcred/.gitattributes):

*.bat   text eol=crlf
*.go    text eol=lf

Because no matter how your IDE/editor is configured, I need CRLF for my Windows bat scripts to run properly, and I prefer LF for Go files, which I edit on Windows or Linux. I have always considered local settings like core.autocrlf an anti-pattern, best left set to false.

But a .gitattributes can declare many other Git elements, such as the diff, merge filter, and content filter settings mentioned above.

The .gitattributes file is not "mandatory", but a useful tool in the Git toolbox, one that can be shared safely in a project code base.

VonC

It depends. The most common uses for .gitattributes files are line ending handling, working-tree encodings, and Git LFS. If you're using Git LFS, then it's required for those files to be handled as LFS files.

Otherwise, if all you care about is line endings, it depends on your platform. If your project is Unix-only, then it's not required. However, if your project may be used across systems, it's typically helpful to have one to indicate which files are text (that is, should be subject to line ending conversion) and which are not. Git does often guess correctly, but it only looks at the beginning of the file. Certain file types (notably PDFs) often start with a large block of ASCII-compatible text and only then include binary data, and in those cases Git will need help.
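
To illustrate why a "look at the beginning" guess can fail, here is a minimal Python sketch of that kind of heuristic. It is not Git's actual code: the `looks_binary` name, the NUL-byte test, and the 8000-byte window are assumptions made for illustration, and the sample byte strings are made up.

```python
# Assumed size of the leading window that the heuristic inspects.
WINDOW = 8000


def looks_binary(data: bytes) -> bool:
    """Guess "binary" if a NUL byte appears within the leading window."""
    return b"\x00" in data[:WINDOW]


# A PNG-like blob has NUL bytes right after its signature.
png_like = b"\x89PNG\r\n\x1a\n" + b"\x00\x00\x00\rIHDR"

# A PDF-like blob: a long ASCII prefix, then binary data past the window.
pdf_like = b"%PDF-1.7\n" + b"% comment\n" * 1000 + b"\x00\x01\x02"

print(looks_binary(png_like))  # True
print(looks_binary(pdf_like))  # False, although the tail is binary
```

With a guess like this, the PDF-like file would be treated as text, which is the case where marking the file type explicitly in .gitattributes helps.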

If you want to include things like shell scripts or batch files, you absolutely do need a .gitattributes file because POSIX shells don't accept CR as part of a line ending and batch files must contain CRLF. An eol=lf or eol=crlf is therefore required for reproducible behaviour.
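
As a rough illustration of the shell-script case, the Python sketch below models the two conversions that `eol=lf` and `eol=crlf` ask for, and shows what a stray CR does to a `#!` line. The helper names and the sample script are hypothetical, and this is a simplification rather than how Git or the operating system implements it.

```python
def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


def to_crlf(data: bytes) -> bytes:
    """Convert line endings to CRLF, normalizing to LF first."""
    return to_lf(data).replace(b"\n", b"\r\n")


def shebang_interpreter(script: bytes) -> bytes:
    """Return everything after '#!' on the first line, split on LF only."""
    first_line = script.split(b"\n", 1)[0]
    return first_line[2:] if first_line.startswith(b"#!") else b""


# Hypothetical script that was saved with CRLF line endings.
script_crlf = b"#!/bin/sh\r\necho hello\r\n"

print(shebang_interpreter(script_crlf))         # b'/bin/sh\r'
print(shebang_interpreter(to_lf(script_crlf)))  # b'/bin/sh'
```

With CRLF endings the interpreter path ends in a carriage return, so it no longer names an existing program; with LF endings it is the expected `/bin/sh`.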

Similarly, some people on Windows have tools that have not come into modern times (where we overwhelmingly use UTF-8) and still absolutely require their data to be in little-endian UTF-16 with BOM. For those programs, typically a working-tree encoding is important so that Git will internally store them as UTF-8 text and can do things like diffs and merges on them. It is the case that most editors and tools these days handle UTF-8 and LF just fine, which is probably why you haven't really seen problems.
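
The round trip that a working-tree encoding implies can be sketched in Python as follows. This is only a model of the idea, not Git's implementation: the function names and the sample text are hypothetical.

```python
# Hypothetical file contents, as a legacy Windows tool might write them.
text = "name=value\r\n"

# Little-endian UTF-16 with a BOM: the form the tool reads and writes.
utf16_bytes = b"\xff\xfe" + text.encode("utf-16-le")

# Each ASCII character is followed by a NUL byte in UTF-16LE, so a
# NUL-based binary check would not treat this file as text.
has_nul = b"\x00" in utf16_bytes


def to_internal(data: bytes) -> bytes:
    """Decode UTF-16 (the BOM selects the byte order) and store as UTF-8."""
    return data.decode("utf-16").encode("utf-8")


def to_working_tree(data: bytes) -> bytes:
    """Reverse direction: stored UTF-8 back to UTF-16LE with a BOM."""
    return b"\xff\xfe" + data.decode("utf-8").encode("utf-16-le")


print(has_nul)                   # True
print(to_internal(utf16_bytes))  # b'name=value\r\n'
```

Once the content is stored as UTF-8 it contains no NUL bytes, which is why diffs and merges can work on it as ordinary text while the tool still sees UTF-16 on disk.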

I do strongly recommend at least a simple `* text=auto`, if nothing else, if your project will be used on Windows, because it means that people will not accidentally commit CRLF line endings in your text files and also that people will have the line endings they prefer when working across systems. It's a simple step that can make the experience with your project a lot better.

bk2204
  • Thank you very much for that great answer. I think if `* text=auto` solves most of the crlf issues then that's at least quite maintainable. Or if it's only for very specific edge case files like pdfs then it's not that bad. The link I found however did give me a scare because shit that's just too much. Would you also recommend using `git config --global core.autocrlf input` in combination with `* text=auto` or is that unnecessary? –  Jul 23 '22 at 00:24
  • I also found this, https://stackoverflow.com/questions/17832616/make-git-use-crlf-on-its-head-merge-lines/35474954#35474954, which, if I understand correctly, tells me git will automatically preserve line endings. So the only time I can think of where this gives an issue is if you created, e.g., a .py script on Windows and then used it on Linux, in which case it will preserve the crlf and break. Would `* text=auto` solve that problem? –  Jul 23 '22 at 00:25
  • `* text=auto` will solve that problem. Using `.gitattributes` is almost always better than using `core.autocrlf` because you don't have to rely on other people setting it correctly, and so I don't have a strong opinion about how people should set that since it's a matter of choice. – bk2204 Jul 23 '22 at 01:30