Summary: I made a git commit that contained oversized files and, when trying to push, got the dreaded "large files" error. I restructured the repo to have a new top-level directory that no longer contains any large files, but I still get the "large files" error when trying to push. I tried various common solutions (below), but git keeps trying to push files that are outside the new top-level repo.

Details on what I did:

  1. I manually moved the .git directory and the .gitignore file to my desired new directory, as described here.

  2. I confirmed that the new root directory was successfully recognized via git rev-parse --show-toplevel.

  3. I tried to push to the remote again (git push origin main), but got the error File <filepath> is 102.90 MB; this exceeds GitHub's file size limit of 100.00 MB, where <filepath> is a path inside the old directory, not the new one.

  4. I tried to remove the file from the cache via git rm -r --cached <filepath> (as described in the accepted answer here), but this yields the error fatal: <filepath> is outside repository.

  5. I reset via git reset HEAD~, then tried again to push, but I got the same error as above.

  6. I tried to filter the branch history to remove commits involving the large file (stitched.csv) via git filter-branch --index-filter 'git rm -r --cached --ignore-unmatch stitched.csv' HEAD, as described here. Then I tried to push again and still got the same error, again referring to stitched.csv.

In practice, I have quite a few oversized files, so I would really rather not have to remove each one from the cache manually. I have made numerous good commits since the ones that involved large files.

Any help would be much appreciated.

half-pass
  • The `git push` command pushes *commits*. Every commit has a full snapshot of every file. This is not a cache: this is the fundamental way that Git works. If some commit has a large file, it has that file, forever, because no commit can ever change. If you don't like that commit, you can stop using it—and all its descendants—and avoid `git push`-ing that commit (note that an attempt to push one of its descendants will take *that* commit too, which is why you have to ditch all the descendants). – torek Aug 25 '21 at 21:32
  • There are some tools specifically for rewriting repositories to ditch large files: The BFG and the newfangled `git filter-repo` both have support for this. Filter-repo is the replacement for the now-deprecated, hard-to-use `filter-branch`, but filter-repo isn't actually included in Git distributions: you have to fetch and install it. (Same for The BFG.) – torek Aug 25 '21 at 21:33

1 Answer

As commented, you need to filter and remove those large files from your Git history.
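To see which files are actually blocking the push, it can help to list the large blobs in the commits that the push would send. The following is a minimal sketch, not a definitive recipe: the function names are made up for illustration, it assumes the remote-tracking branch is origin/main and the local branch is main (as in the question), and it assumes GitHub's limit corresponds to 100 MiB (104857600 bytes).

```shell
#!/usr/bin/env bash
# Sketch: find blobs over a size limit in commits that have not been pushed yet.
# Assumptions (not from the original post): branch names origin/main and main,
# a limit of 100 MiB, and the hypothetical function names below.

# Read "type sha size path" lines on stdin and print "size path"
# for every blob larger than the given number of bytes.
filter_large_blobs() {
  awk -v limit="$1" '
    $1 == "blob" && $3 + 0 > limit + 0 {
      sub(/^blob [0-9a-f]+ /, "")   # drop the type and hash, keep size and path
      print
    }'
}

# List oversized blobs reachable from local commits that the remote lacks.
list_large_unpushed() {
  local remote_ref="${1:-origin/main}"
  local local_ref="${2:-main}"
  local limit_bytes="${3:-104857600}"
  git rev-list --objects "${remote_ref}..${local_ref}" |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    filter_large_blobs "$limit_bytes"
}

# Usage (run inside the repository):
#   list_large_unpushed origin/main main
```

Any path printed here lives in a commit that is still part of the history being pushed, even if the file no longer exists in the working tree, which is why git rm --cached and a single git reset did not help.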

The more recent option is the third-party tool git filter-repo, which is written in Python and has to be installed separately.

In order to not have to list every large file, you can determine a size above which you want any file to be removed:

git filter-repo --strip-blobs-bigger-than 2M

Replace "2M" (two megabytes) with an appropriate size: see "How to find the N largest files in a git repository?".
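One way to pick that size is to look at the largest blobs anywhere in the history first. This is only a sketch under assumptions: the function names are hypothetical, and the default of 10 entries is arbitrary.

```shell
#!/usr/bin/env bash
# Sketch: show the N largest blobs in the whole history to help choose a threshold.
# The function names and the default count of 10 are illustrative assumptions.

# Read "type sha size path" lines on stdin and print the N largest blobs
# as "size path", biggest first.
largest_blobs() {
  local count="${1:-10}"
  awk '$1 == "blob" { sub(/^blob [0-9a-f]+ /, ""); print }' |
    sort -k1,1nr |
    head -n "$count"
}

# Walk every object reachable from any ref and report the largest blobs.
list_largest_in_history() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    largest_blobs "${1:-10}"
}

# Usage (run inside the repository):
#   list_largest_in_history 20
```

Sizes are printed in bytes. Choosing a threshold just below the smallest file you want gone, and above the largest file you want to keep, lets a single git filter-repo run handle all of them.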

VonC