I have a repository containing 5 files that have been committed with CRLF. I don't know how this happened, but on a clean checkout if I use this command it prints 5 files (out of hundreds):

git grep -I --files-with-matches --perl-regexp '\r' HEAD

Does anyone know how I can reproduce this issue? In other words, what set of Git settings can lead to this situation?

Marinos An

2 Answers

Internally, Git just stores raw data. If you run git hash-object -w you can push any blob data you like into the repository (though you would then need to attach a tag, or add the blob to the index to get it stored into a new commit).
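As a minimal sketch of that point, the following writes a CRLF blob directly into the object database and reads it back. The temporary repository and the variable names `repo`, `blob` and `cr_count` are illustrative assumptions, not anything from the question.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository (hypothetical location)
cd "$repo"
git init -q .

# Content read from stdin without --path bypasses all conversion filters,
# so the CR bytes are stored exactly as written.
blob=$(printf 'one\r\ntwo\r\n' | git hash-object -w --stdin)

# Count the carriage returns in the stored blob.
cr_count=$(git cat-file -p "$blob" | tr -cd '\r' | wc -c | tr -d ' ')
echo "blob $blob contains $cr_count CR bytes"
```

The blob exists in the repository at this point but is not reachable from any commit until it is added to the index or tagged.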

As I noted in my answer to "What does 'check out code' mean in git documentation for line endings?", Git will apply CRLF-to-LF-only line-endings translation on any file on which such translations are enabled, at the time you run git add on that file. The result is that the version of the file in the index (or more precisely, the blob hash in the index, representing the in-repo blob object) has LF-only line endings.

If you run git add on that file with:

  • translations disabled globally, or
  • translations disabled on that particular path name

then Git won't do those translations, and the index version of the file will have any '\r' characters it had in the work-tree version.
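A rough way to see both behaviours side by side, assuming a scratch repository and a hypothetical file `crlf.txt` (the helper `index_cr_count` is also just for illustration):

```shell
set -e
repo=$(mktemp -d)   # throwaway repository (hypothetical location)
cd "$repo"
git init -q .
printf 'one\r\ntwo\r\n' > crlf.txt

# Helper: count CR bytes in the index copy of a path.
index_cr_count() { git cat-file -p ":$1" | tr -cd '\r' | wc -c | tr -d ' '; }

# Translations enabled: git add cleans CRLF to LF for this new file.
git config core.autocrlf true
git add crlf.txt
converted=$(index_cr_count crlf.txt)

# Translations disabled for this path: the CRs survive into the index.
git rm -q --cached crlf.txt
echo 'crlf.txt -text' > .gitattributes
git add crlf.txt
kept=$(index_cr_count crlf.txt)

echo "core.autocrlf=true: $converted CRs; -text: $kept CRs"
```

Git may print a line-ending warning on the first git add; that is expected and does not stop the conversion.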

The settings in .gitattributes and/or core.autocrlf control whether translations are enabled, and if so, which translations to perform. For historical reasons (Git began with no translation at all, then gained early Windows support, passed through various intermediate versions, and arrived at the current .gitattributes method), the rules for all of this are quite complicated.

"In other words what is a set of git settings that can lead to this situation?"

There are many different ways to do it, but the one that's the simplest by far is to write a .gitattributes file with just:

* -text

or to set core.autocrlf to false (but note that .gitattributes overrides core.autocrlf, in general). Now Git will treat all files as binary, doing no "cleaning" during git add and no "smudging" during git checkout. The work-tree contents will now match the index contents byte-for-byte, except for any changes you make yourself, or make by running programs, to work-tree files. You can then git add those new files to the index and it will copy them in, byte-for-byte; and each new git commit you make will use what's in the index.
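One possible reproduction, sketched under the assumption of a fresh scratch repository; the file names and the commit identity are made up for the example:

```shell
set -e
repo=$(mktemp -d)   # throwaway repository (hypothetical location)
cd "$repo"
git init -q .
git config user.name 'Example User'
git config user.email 'user@example.com'

# Disable translations for every path, then commit one CRLF and one LF file.
echo '* -text' > .gitattributes
printf 'one\r\ntwo\r\n' > crlf.txt
printf 'one\ntwo\n' > lf.txt
git add .gitattributes crlf.txt lf.txt
git commit -q -m 'Commit a CRLF file with translations disabled'

# The committed blob still carries its carriage returns.
committed_crs=$(git cat-file -p HEAD:crlf.txt | tr -cd '\r' | wc -c | tr -d ' ')
echo "HEAD:crlf.txt contains $committed_crs CR bytes"

# The command from the question; --perl-regexp needs a Git built with PCRE,
# so a failure here is tolerated rather than treated as fatal.
git grep -I --files-with-matches --perl-regexp '\r' HEAD || true
```

On a Git with PCRE support, the final command should list only the CRLF file, which is the situation described in the question.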

Once you have stored, as permanent and unchangeable commits, the particular versions of particular files you care about, you can modify .gitattributes to contain any other settings you would like to test, and run git checkout <commit> -- <path> to make Git copy the file from a commit, to the index, through the smudging filters, and into the work-tree. You can modify any work-tree file any way you like, then run git add <path> to run the file through the cleaning filters to copy it into the index. These filters will be controlled by whatever you have in .gitattributes at the time you run the commands, so you can experiment with different attributes without having to make new commits.
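The experiment described above might look like the sketch below. The pattern `*.txt text eol=crlf` is only one example of a setting to test, and `sample.txt` is a hypothetical file; the changed .gitattributes is also staged so that its index and work-tree copies agree, which is a precaution rather than a requirement stated in this answer.

```shell
set -e
repo=$(mktemp -d)   # throwaway repository (hypothetical location)
cd "$repo"
git init -q .
git config user.name 'Example User'
git config user.email 'user@example.com'
git config core.autocrlf false   # keep global settings out of the experiment

# 1. Commit an LF-only file with all translations off.
echo '* -text' > .gitattributes
printf 'one\ntwo\n' > sample.txt
git add .gitattributes sample.txt
git commit -q -m 'Baseline: LF-only blob'

# Helper: count CR bytes on stdin.
count_crs() { tr -cd '\r' | wc -c | tr -d ' '; }

# 2. Change the attributes without committing, then re-extract the file
#    so that it passes through the smudge step (LF becomes CRLF).
echo '*.txt text eol=crlf' > .gitattributes
git add .gitattributes
rm sample.txt
git checkout HEAD -- sample.txt
worktree_crs=$(count_crs < sample.txt)

# 3. Edit the work-tree file and add it: the clean step stores LF only.
printf 'one\r\ntwo\r\nthree\r\n' > sample.txt
git add sample.txt
staged_crs=$(git cat-file -p :sample.txt | count_crs)

echo "work-tree CRs after checkout: $worktree_crs; index CRs after add: $staged_crs"
```

Swapping the pattern in step 2 for other attribute combinations lets you compare their effect against the same committed blob.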

torek
  • Could the above lead to the conclusion that there is no way (using the standard git client) to perform a commit that includes CRLF if `core.autocrlf=true` (Windows) / `core.autocrlf=input` (Linux) is set and no .gitattributes file is present? – Marinos An Oct 12 '17 at 10:13
  • @MarinosAn: no, because if you start with a commit that has CRLF endings in some file and extract that file from the commit into the index, the index version of the file has CRLF endings. If you then never `git add` the file back into the index, the index version continues to have CRLF endings, and the next commit uses the index version. At this point *no setting matters:* what goes into the next commit is the same as what was in the previous commit. – torek Oct 12 '17 at 14:54
  • Both your answers really cleared up a lot of things. They also helped me find out how some files can become so persistently unrevertable in git: https://stackoverflow.com/a/45030792/1555615 – Marinos An Oct 12 '17 at 17:00

You're probably using git config --global core.autocrlf true

For a better explanation look at the docs.

Bernardo Duarte