Summary:
This is most likely due to a buglet in Git-LFS; the fix (or at least a workaround) is in a slightly newer version of Git-LFS.
See https://github.com/git-lfs/git-lfs/issues/1880 and https://github.com/git-lfs/git-lfs/pull/1932 for a description of the bug and the fix/work-around.
Here's my capsule summary though:
When you use Git-LFS, your standard Git configuration is modified to use Git's clean and smudge filtering processes. A smudge filter is meant to "dirty up" a file as it comes out of the version control system, and a clean filter is its logical opposite: it cleans up the file for storage inside the VCS. To use such filters, you must mark up both your .git/config file and your .gitattributes file; Git-LFS does this marking-up for you automatically.
As a somewhat silly example, some people like to have each line end with CR-LF instead of just a newline (LF-only), yet store the version-controlled version of the file with newline endings. Git can do this directly (using end-of-line filters) but if for some reason you wanted to write a program to do it yourself, you would use the smudge filter to replace LF-only with CR-LF, and use the clean filter to replace CR-LF with LF-only. Your work-tree files would then have your desired Windows-style line endings, while your committed files would have the desired Linux-style line endings.
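To make the line-ending example concrete, here is a minimal sketch of such a pair of filters in Python. The function names `smudge` and `clean` are my own choice for illustration; a real filter program would read the file contents from standard input and write the result to standard output, then be wired in through .git/config and .gitattributes.

```python
def smudge(data: bytes) -> bytes:
    """Index -> work-tree: turn LF-only line endings into CR-LF."""
    # Normalize first so an existing CR-LF does not become CR-CR-LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def clean(data: bytes) -> bytes:
    """Work-tree -> index: turn CR-LF line endings back into LF-only."""
    return data.replace(b"\r\n", b"\n")
```

The important property is that `clean` undoes `smudge`: whatever the work-tree copy looks like, the version that goes into the index has LF-only endings.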
More practically, some people like to expand keywords (RCS- or CVS-like $Id$ for instance). Git also includes a built-in ident filter that does this specifically for $Id$. The expansion is done when extracting files to the work-tree (as if by a smudge filter), and removed (put back to just $Id$) when adding files to go into new commits.[1] If you wanted to handle more keywords, e.g., $Log$,[2] you would have to write your own filters.
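As a rough sketch of what an $Id$-style expand/collapse pair looks like, the following Python mimics the idea behind the built-in ident filter: expand $Id$ to $Id: <blob hash> $ on the way out, collapse it on the way in. The names `ident_smudge`, `ident_clean`, and `blob_id` are hypothetical, and this is not Git's actual implementation, just an illustration of the round trip.

```python
import hashlib
import re

# Matches the collapsed form "$Id$" and any expanded form "$Id: ... $".
ID_PATTERN = re.compile(rb"\$Id(?::[^$\n]*)?\$")


def blob_id(data: bytes) -> str:
    """Hash the data the way Git hashes a blob: sha1 over 'blob <size>\\0' + data."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def ident_clean(data: bytes) -> bytes:
    """Work-tree -> index: collapse any expanded keyword back to $Id$."""
    return ID_PATTERN.sub(b"$Id$", data)


def ident_smudge(data: bytes) -> bytes:
    """Index -> work-tree: expand $Id$ using the hash of the clean contents."""
    cleaned = ident_clean(data)
    expanded = b"$Id: " + blob_id(cleaned).encode("ascii") + b" $"
    # A callable replacement avoids any escape processing of the bytes.
    return ID_PATTERN.sub(lambda match: expanded, cleaned)
```

Because the hash is computed over the collapsed contents, running `ident_smudge` twice gives the same result as running it once, and `ident_clean` always recovers the committed form.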
What Git-LFS does is to use (abuse?) the clean and smudge filters to replace large files with specially decorated hash IDs that act as "pointers" to external large-file storage. This way, Git never stores (or even sees, in a sense) the large files at all. It sees, and stores, only these decorated hashes. The Git-LFS filters are responsible for replacing a funny hash with the actual large-file contents on checkout, and replacing the large-file contents with a funny hash (new or re-used as appropriate) at git add time.
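Here is a toy sketch of that pointer trick in Python. The pointer layout (a version line, an oid sha256: line, and a size line) follows the Git-LFS pointer format, but everything else is an assumption for illustration: `STORE` is an in-memory dictionary standing in for the external storage, and `lfs_clean` / `lfs_smudge` are hypothetical names, not the real Git-LFS code.

```python
import hashlib

POINTER_VERSION = b"version https://git-lfs.github.com/spec/v1\n"

# Hypothetical stand-in for the external large-file storage.
STORE = {}


def lfs_clean(data: bytes) -> bytes:
    """Work-tree -> index: stash the contents, hand Git a small pointer."""
    oid = hashlib.sha256(data).hexdigest()
    STORE[oid] = data
    return POINTER_VERSION + b"oid sha256:%s\nsize %d\n" % (
        oid.encode("ascii"),
        len(data),
    )


def lfs_smudge(pointer: bytes) -> bytes:
    """Index -> work-tree: swap a pointer for the stored contents."""
    if not pointer.startswith(POINTER_VERSION):
        return pointer  # not a pointer, so pass the data through untouched
    fields = dict(line.split(b" ", 1) for line in pointer.splitlines())
    oid = fields[b"oid"].decode("ascii").split(":", 1)[1]
    return STORE[oid]
```

With this in place, Git itself would only ever hash and store the roughly 130-byte pointer text, no matter how large the real file is.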
But there is a technical glitch: Git uses pipes to implement both smudge and clean filters. (These pipes have become quite fancy recently; see the Long Running Filter Process section of the gitattributes documentation.) The Git-LFS code and the Git code itself must be careful not to "constipate" the pipes ... and, well, that care was missing. (See my answer to live output from subprocess command for some of the details.)
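To show the general hazard (not the specific Git-LFS bug, whose details are in the linked issue), here is a small Python sketch of feeding data through a filter process over pipes. `ECHO_FILTER` is a hypothetical stand-in filter that just copies its input to its output, and `run_filter` is my own helper name.

```python
import subprocess
import sys

# Hypothetical stand-in for a filter: copies stdin to stdout unchanged.
ECHO_FILTER = [
    sys.executable,
    "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]


def run_filter(command, data: bytes) -> bytes:
    """Send data through a filter process without clogging either pipe."""
    proc = subprocess.Popen(
        command, stdin=subprocess.PIPE, stdout=subprocess.PIPE
    )
    # Risky alternative: write everything, then read everything:
    #     proc.stdin.write(data); proc.stdin.close(); proc.stdout.read()
    # With enough data, the child blocks writing to a full stdout pipe
    # while the parent is still blocked writing to a full stdin pipe.
    # communicate() feeds stdin and drains stdout concurrently instead.
    out, _ = proc.communicate(data)
    if proc.returncode != 0:
        raise RuntimeError("filter exited with status %d" % proc.returncode)
    return out
```

The commented-out pattern works fine for small inputs, which is exactly why this kind of bug tends to show up only with large files.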
[1] Specifically, smudge filters (along with ident and any end-of-line hacking) are applied to files when they are copied from Git's index to the work-tree. Clean filters are applied to files when they are copied from the work-tree into the index. Since committed trees are built only from files as found in the index, and the index version is always "clean", all commits are always clean as well.
[2] What $Log$ does in RCS and CVS is to expand to the entire log-message history of the file (essentially, all the commits that touched the file, except that RCS and CVS are file-based rather than commit-based) so that the history is there while you are editing the file itself. Having actually used such things before, I am firmly convinced that this is a bad idea: this kind of metadata belongs only in the version control system (which, like Git or Mercurial, should be distributed with developers given access to the entire repository). Nonetheless, there are people who like this, and it is technically possible to do it in Git.