11

I'm initializing a new Git repo with a huge pile of files. The repo is using Git LFS. I want to ensure that I've told LFS to track all files that should be handled, before I make my first commit.

I see that git lfs ls-files will list all the files that ARE tracked by LFS. However, (a) I want the opposite: all files in the repo that aren't tracked by LFS (and aren't in .gitignore), and (b) this command only works after you have committed files.

Does anyone have some git-fu or Ubuntu-fu to list all the files in the repo that aren't ignored and aren't matched by the track patterns in the various .gitattributes files Git LFS uses?


The closest I've come is this command, which lists files in the repo over 100kB; I then manually scan the list and hope that I have them all covered by a tracking pattern.

find . -type f -exec du -Sha -t 100000 {} +
Phrogz

2 Answers

12

Even though the question concerns files that have not been committed, let me suggest a solution that lists the files tracked by git but not by git-lfs after a commit. Concatenate the list of files tracked by git (git ls-files) with the list tracked by git-lfs (git lfs ls-files | cut -d' ' -f3-), then keep only the entries that appear exactly once in the combined list:

{ git ls-files && git lfs ls-files | cut -d' ' -f3-; } | sort | uniq -u

After which you could edit your commit (git rm --cached and git commit --amend) if you notice a file that has sneaked in...

At the pre-commit stage, proceeding by watching the untracked files list and successively using git lfs track and git add should be quite safe.
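For the pre-commit case, one possible way to automate that check is to ask Git which attributes apply to each file instead of reading the untracked list by eye. The following is a minimal sketch, not part of the original answer: lfs_untracked is a hypothetical helper name, and it assumes no path contains the literal text ": filter: ". It uses git ls-files --cached --others --exclude-standard to list staged and untracked files that are not ignored, and git check-attr to report each file's filter attribute, which git lfs track sets to lfs.

```shell
# Hypothetical helper: print every non-ignored file (staged or untracked)
# whose "filter" attribute is not "lfs". Works before the first commit.
lfs_untracked() {
  # --cached: files already added; --others: untracked files;
  # --exclude-standard: honour .gitignore and the other standard excludes.
  git ls-files --cached --others --exclude-standard |
    # Prints one line per path, e.g. "notes.txt: filter: unspecified".
    git check-attr --stdin filter |
    # Drop the paths that a track pattern already sends to LFS.
    grep -v ': filter: lfs$' |
    # Strip the ": filter: <value>" suffix, leaving only the path.
    sed 's/: filter: [^:]*$//'
}
```

Run lfs_untracked inside the repository; every path it prints would be committed as a regular Git blob, so each one is a candidate for git lfs track. The .gitattributes and .gitignore files themselves will normally appear in the output, which is expected.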

Beware that empty files are not considered LFS objects by the specification, so they will not be listed by git lfs ls-files.

Ben
François
  • The range in the `cut` command (`-f3`) should be `-f3-` in case there's a space in one of the paths. – Grimeh Nov 02 '17 at 13:36
  • A git-alias version of this that works regardless of where you are in the directory structure of a git repository: ` lfs-untracked = "!_() { ((git ls-files | egrep \"^${GIT_PREFIX}\") && (git lfs ls-files ${GIT_PREFIX:+-I ${GIT_PREFIX}} | cut -d' ' -f3-)) | sort | uniq -u ; }; _"` – crimson-egret Nov 09 '21 at 15:34
  • Sorted by file size: `({ git ls-files && git lfs ls-files | cut -d' ' -f3-; } | sort | uniq -u) | xargs stat -c '%s %n' | numfmt --to=iec | sort -h` – Martin Valgur Aug 15 '22 at 11:14
1

Let me give some ideas:

To get a list of all files in your repository:

find . -path ./.git -prune -o -type f -print > all.txt

To get a list of all files that will be tracked by LFS:

set -f; for f in $(grep 'filter=lfs' .gitattributes | cut -d ' ' -f 1); do find . -name "$f"; done > lfs.txt; set +f

To get a list of all files that will NOT be tracked by LFS:

grep -F -x -v -f lfs.txt all.txt > non-lfs.txt
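The same "everything minus the LFS matches" idea can be combined with the size threshold from the question, so that only large files without a track pattern are reported. This is a rough sketch under stated assumptions, not a definitive implementation: lfs_untracked_over is a hypothetical name, the default threshold of 100000 bytes mirrors the question, and paths are assumed to contain no newlines, quotes, or the literal text ": filter: ". Pattern matching is left to git check-attr, so nested .gitattributes files are taken into account.

```shell
# Hypothetical helper: list non-ignored files of at least MIN_BYTES
# (default 100000) that are not routed through the LFS filter.
# Output: "<size in bytes><TAB><path>", one file per line.
lfs_untracked_over() {
  min_bytes=${1:-100000}
  git ls-files --cached --others --exclude-standard |
    git check-attr --stdin filter |
    grep -v ': filter: lfs$' |
    sed 's/: filter: [^:]*$//' |
    while IFS= read -r path; do
      # Skip entries that are staged but no longer present on disk.
      [ -f "$path" ] || continue
      # Arithmetic expansion trims any padding that wc adds.
      size=$(($(wc -c < "$path")))
      if [ "$size" -ge "$min_bytes" ]; then
        printf '%s\t%s\n' "$size" "$path"
      fi
    done
}
```

For example, lfs_untracked_over 1000000 | sort -n ranks the files of 1 MB or more that still lack a track pattern, smallest first.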