
How is that possible? I have a .gitattributes file with the following content:

* text=auto

UPDATE: Huh. It turns out it's the .gitattributes file that causes this behavior. Can somebody explain this or point to the relevant docs?

vehsakul

2 Answers


After setting up a .gitattributes file, run git add --renormalize . (from the directory that contains these *.sh and *.bash files) or git add --renormalize *.sh *.bash to update the index copies before committing. Alternatively, run touch *.sh *.bash; git checkout -f *.sh *.bash to update the work-tree copies.

What's going on here

As you no doubt already know, a Git repository contains commits. Each commit has a frozen copy of each file that was committed, in exactly the state that it had when you (or whoever) committed it. This frozen copy can never be changed, so if it has CRLF line endings, it has them forever, and if it has LF-only line endings, it has them forever. Any other copy of that file, in any other future commit, can be different, but this copy in this commit is frozen. (Any copy in any other existing commit is of course also frozen, but could be different.)

Internally, each committed file is in a special, Git-only format, compressed and usable only by Git—and once committed, frozen forever in that particular commit. But of course, you can look at committed files, by extracting them; and you can make new committed files, working with the extracted files that you can modify. Thus, Git needs two operations:

  • copy a file from a commit, to where you can work on it; and
  • copy a file from where you have worked on it, to (be ready for) a commit.

It is these two operations that actually perform any line-ending conversion, whether from CRLF to LF-only or vice versa.

The place that holds the versions of files that you work with and work on is, perhaps unsurprisingly, called the work-tree (or some variant of this such as working tree or working directory). You use, and work on, files in your work-tree. You tell Git to copy files from a commit, to the work-tree, or to copy files from the work-tree, to (be ready for) a commit.

The index

There's an extra wrinkle in the way here, and that is that Git doesn't make commits from what's in your work-tree at all. Instead, Git inserts, between the commit and the work-tree, a third holding-area. Git calls this the index, the staging area, or sometimes the cache, depending on who / which part of Git is doing the calling.

Files in the index are always ready to be committed. That is, they have the same special, compressed, Git-only format that they would in a commit. That's the trick that makes git commit so fast (compared to other version control systems anyway): everything is, at all times—or almost all times anyway—ready to go. When you run git commit, Git doesn't even look at the work-tree. It just packages up the files that are in the index, in the form they have now, all compressed and Git-ified and ready to go.

The git add command copies files from the work-tree, into the index, making them ready-to-go. The git checkout command, by comparison, copies files from a commit—the one you're checking out—first into the index so that they're ready for the next commit, and then on into the work-tree.
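The claim that git commit packages the index, not the work-tree, can be checked with a small experiment. What follows is a minimal sketch, assuming a POSIX shell with Git on the PATH; the throwaway repository, the file name note.txt, and the Example identity are made up for illustration.

```shell
set -e
# Throwaway repository (hypothetical location and names).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Stage one version of the file, then edit the work-tree copy without re-adding it.
printf 'staged\n' > note.txt
git add note.txt
printf 'edited after staging\n' > note.txt

# The commit is built from the index, so the later edit is left out.
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'commit whatever is in the index'

committed=$(git cat-file -p HEAD:note.txt)
worktree=$(cat note.txt)
echo "committed: $committed"
echo "work-tree: $worktree"
```

The committed copy should still read "staged", while the work-tree copy carries the later edit until it is added.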

This is why you need git add --renormalize

Suppose that some file is stored, in some way (with or without CRLF endings), in a commit. You run git checkout name to pick a branch and its tip commit. The files in that commit go into the index, and from there to the work-tree. The copy-out step—index to work-tree—changes the files to have the line endings someone told Git to use, probably through a .gitattributes file in the commit you just checked out.
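The copy-out step described above can be observed directly. This is a rough sketch, assuming a POSIX shell with Git on the PATH; the repository, file.txt, the Example identity, and the *.txt text eol=crlf rule are chosen only for illustration and are not taken from the question.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config core.autocrlf false   # keep the config out of the picture

# Hypothetical rule: .txt files are text and get CRLF in the work-tree.
printf '*.txt text eol=crlf\n' > .gitattributes
printf 'one\ntwo\n' > file.txt
git add .gitattributes file.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'LF file plus eol=crlf rule'

# Remove the work-tree copy and let Git copy it out of the index again.
rm file.txt
git checkout -- file.txt

# Count carriage returns in the index copy and in the work-tree copy.
index_cr=$(git cat-file -p :file.txt | tr -cd '\r' | wc -c | tr -d ' ')
worktree_cr=$(tr -cd '\r' < file.txt | wc -c | tr -d ' ')
echo "CRs in index: $index_cr, CRs in work-tree: $worktree_cr"
```

The index copy stays LF-only; only the copy written to the work-tree picks up CRLF endings.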

If those are wrong, you now change the .gitattributes file.[1] This would, perhaps, change the way the files should be in the next commit. It would, perhaps, change the way the files should be in the work-tree. But here's the problem: Git already has the files the way it thinks is right, in both the index and the work-tree.

Moreover, here's the worse problem: Not only does Git have the files the way it thinks is right, it also thinks it doesn't need to do any new work with them. If you run git checkout or git add on them right now, Git cleverly notices that the work-tree copies have not been touched and does nothing, even though a re-checkout or re-add would do something different!

The result is that you have to, in effect, trick Git into redoing work. If you need some or all work-tree files updated according to the "from index to work-tree" sequence, you can, for each such file:

  • remove the work-tree copy and run git checkout again, or
  • touch the work-tree copy (so that Git thinks you've modified it) and run git checkout -f to force-overwrite it.

If you need some or all work-tree files updated according to the "from work-tree to index" sequence, you can:

  • touch the work-tree copy (so that Git thinks you've modified it) and run git add, or
  • use the new-ish git add --renormalize to force Git to re-add the files even though it can see that you haven't touched them.

If your Git is too old to have git add --renormalize, you can use the touch method.
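The "work-tree to index" direction can be tried out in a scratch repository. The sketch below assumes a POSIX shell and a Git recent enough to support git add --renormalize; the repository, file.txt, and the Example identity are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config core.autocrlf false   # no conversion from the config

# Commit a file with CRLF endings while no attributes are in effect.
printf 'one\r\ntwo\r\n' > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'add CRLF file'
before=$(git cat-file -p :file.txt | tr -cd '\r' | wc -c | tr -d ' ')

# Introduce normalization, then force Git to re-run the add-time conversion
# on tracked files even though they have not been touched.
printf '* text=auto\n' > .gitattributes
git add --renormalize .
after=$(git cat-file -p :file.txt | tr -cd '\r' | wc -c | tr -d ' ')

echo "CRs in index before: $before, after: $after"
git status --short
```

After the renormalize step, git status should list file.txt as staged for commit, since its index copy now differs from the committed one. Because --renormalize only re-adds tracked files, the new .gitattributes still has to be added separately.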


[1] This all holds for core.autocrlf and core.eol as well, but it's almost always best to use the .gitattributes file for finer control here. The Git maintainers do this for Git itself, for instance.

torek
  • Thank you for your time writing this answer. It might be useful for somebody. But I'm not sure it explains my issue. I managed to reproduce it with the following actions: 1) create a new repo 2) set autocrlf to 'input' 3) commit some file 4) observe its endings are not changed when switching autocrlf to 'false', deleting and checking out the file 5) commit .gitattributes shown above 6) repeat 2 and 4 and observe the endings do change this time – vehsakul Jan 26 '19 at 22:20
  • Try doing what I suggested: `git add --renormalize`. (Note: I don't have Windows myself, I can at best emulate it on MacOS / Linux etc.) If you edit your six steps into a script someone can run, and put that into your question, that will help the someone-else reproduce your issue. – torek Jan 26 '19 at 22:23
  • (Since I am not using Windows, if I try your steps 1-4, at step 4 I get exactly what I would expect: a file that has LF-only line endings, and there are no issues from here on out because the file should and does always have LF-only line endings. See also https://stackoverflow.com/q/39408793/1256452) – torek Jan 26 '19 at 22:31
  • after running `git add --renormalize .` `git status` is `nothing to commit, working tree clean` – vehsakul Jan 26 '19 at 22:38
  • Hm, OK. I also see that you took out the `eol=lf` that you had in the first couple of versions of the question. I assume that you wanted, and had, CRLF line endings initially, because setting `core.autocrlf = input` tells Git to do only the work-tree-to-index-copy line-ending changes. (Other systems would want and have LF-only endings initially.) Setting `eol=lf` in `.gitattributes` also sets the work-tree-to-index conversion, which `git add --renormalize` should obey. Meanwhile `* text=auto` should have *no* effect as `text=auto` is the default. So this seems puzzling. – torek Jan 26 '19 at 22:46
  • I removed what was irrelevant (I checked). It's `* text=auto` that causes trouble. I think it might be a bug. – vehsakul Jan 26 '19 at 23:16
  • Yes, that definitely sounds like a bug. If so, the specific Git version is probably important (I'd recommend adding that to the question). (It's probably also a good idea to check that your files are being saved in UTF-8 rather than UTF-16, but either way that should still not affect the automatic classification.) – torek Jan 27 '19 at 01:22

The git documentation is vague about how core.autocrlf and .gitattributes play together:

text

This attribute enables and controls end-of-line normalization. When a text file is normalized, its line endings are converted to LF in the repository. To control what line ending style is used in the working directory, use the eol attribute for a single file and the core.eol configuration variable for all text files. Note that core.autocrlf overrides core.eol

What do they mean by "Note that core.autocrlf overrides core.eol", given the following?

core.eol

Sets the line ending type to use in the working directory for files that have the text property set when core.autocrlf is false. Alternatives are lf, crlf and native, which uses the platform’s native line ending. The default value is native. See gitattributes[5] for more information on end-of-line conversion.

This statement says that what I'm seeing is expected behavior, since core.eol is native by default and I have core.autocrlf=false.

UPDATE: This is indeed expected behavior: core.eol applies only to files that have the text attribute set, and only when core.autocrlf is false. In other words, with core.autocrlf=false, * text=auto turns on automatic line-ending conversion (on Windows, with the default core.eol, the result is the same as with core.autocrlf=true).
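This behavior can be reproduced outside Windows by setting core.eol explicitly. The following is a sketch under that assumption: core.eol=crlf stands in for the Windows native default, and the repository, file.txt, and the Example identity are invented for the demonstration.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config core.autocrlf false
git config core.eol crlf         # stand-in for "native" on Windows

printf 'one\ntwo\n' > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m 'add LF file'

# Without any text attribute, core.eol does not apply on checkout.
rm file.txt
git checkout -- file.txt
without_attr=$(tr -cd '\r' < file.txt | wc -c | tr -d ' ')

# With "* text=auto", the file counts as text and core.eol takes effect.
printf '* text=auto\n' > .gitattributes
git add .gitattributes
rm file.txt
git checkout -- file.txt
with_attr=$(tr -cd '\r' < file.txt | wc -c | tr -d ' ')

echo "CRs without attribute: $without_attr, with text=auto: $with_attr"
```

The only thing that changes between the two checkouts is the presence of the .gitattributes rule, which matches the reading of the core.eol documentation quoted above.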

The documentation is vague, but it will probably be improved soon.

vehsakul
  • I do not think it is the expected behavior. After reading your question, I searched the [docs](https://git-scm.com/docs/gitattributes#_checking_out_and_checking_in) myself last night. Your `* text=auto` is a specific way of setting Git attributes and should convert your line endings to LF. From what I gather, the only case in which you would see CRLF is if you had already committed the file with CRLF before changing your configuration (to `core.autocrlf=false`, `* text=auto`) and then didn't commit any changes after that (only checked out). – tryman Jan 27 '19 at 09:23
  • "One can say that with `autocrlf=false` `* text=auto` triggers automatic line ending conversion". Actually, one can say that `* text=auto` triggers automatic line ending conversion regardless of `core.autocrlf`. It's explicitly there to override `core.autocrlf`. – Edward Thomson Jan 28 '19 at 22:13
  • @EdwardThomson `auto` enables input and output conversion. But for output, it's `core.eol` or `core.autocrlf` (if not false) that determines the separator used. – vehsakul Jan 28 '19 at 22:32
  • You shouldn't use the configuration settings, you should check in your configuration using `.gitattributes`. If you want to set a specific line ending, set the `eol` attribute. – Edward Thomson Jan 29 '19 at 00:05