
I'm trying to put a Git project on GitHub, but its history contains some large files. When we run git push to GitHub, we get an error:

remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
remote: error: File .OldFiles/blah1/[file].[ext] is 257.29 MB; this exceeds GitHub Enterprise's file size limit of 100.00 MB

Our first commit (say commit_1) contained a few large files, which were removed in a subsequent commit (say commit_2) without rewriting the Git commit history.

We are using the AFS file system (this may be irrelevant), and all the old large files are stored at a specific location in the .OldFiles directory. In commit_2, we removed .OldFiles with its contents and also added the blah1 directory to .gitignore, but this does not remove the files from the Git history. Unfortunately, we need to keep intact the many other commits made after commit_1 and commit_2.

I tested this on a clone in a local sandbox by creating a branch from commit_1:

git checkout -b fix_branch <commit_1_sha_id>

I found that fix_branch still contains the large files: .OldFiles/blah1/[file].[ext].

Presumably we need to remove these large files under .OldFiles from every commit in the history before a push to GitHub can succeed.

I tried this, but we get an error at git rebase:

error: unrecognised input
error: could not build fake ancestor

I also tried this, but it failed:

git filter-branch --force --index-filter \ 'git rm --cached --ignore-unmatch [project]/.OldFiles/blah1/[file].[ext]' \ --prune-empty --tag-name-filter cat -- --all

I'm not sure whether we can use git cherry-pick, since we cannot discard all the files in commit_1, only these large files.

Is it possible to remove all traces of the large files by rewriting the Git history, editing the commits with git filter-branch and git rebase -i?

P.S. We do not have lfs or bfg installed in our project space.

A little help would be much appreciated by this newbie! :)

Community
  • I would go back to the first commit, remove the file, amend it, and cherry-pick whatever was on top of this branch, if the layout of the branch is almost linear. If the layout is more complex, then I think you'll have to use filter-branch to correct it. – eftshift0 Mar 28 '17 at 18:10
  • This looks like a duplicate of http://stackoverflow.com/questions/2100907/how-to-remove-delete-a-large-file-from-commit-history-in-git-repository and friends. In what way did the `filter-branch` operation fail? – larsks Mar 28 '17 at 18:13
  • @Edmundo, I am not sure how `cherry-pick` would help us. We have already made several commits after `commit_1`, and for `commit_1` specifically we need to discard only the large file and keep the other files intact. – Shadow Phoenix Mar 28 '17 at 18:24
  • Possible duplicate of [How to remove/delete a large file from commit history in Git repository?](http://stackoverflow.com/questions/2100907/how-to-remove-delete-a-large-file-from-commit-history-in-git-repository) – סטנלי גרונן Mar 28 '17 at 18:25
  • That's what I mean. If the branch is rather _linear_, this could be done: ```git checkout commit_1; git rm whateverfile; git commit --amend --no-edit; git cherry-pick commit_1..branch-to-fix;```. Given that commit_2 (which is the first revision that would be cherry-picked) removed the file and you have already removed it in the amended revision, you might get a tree conflict (not completely sure, though). – eftshift0 Mar 28 '17 at 19:02
  • Oh... just noticed I asked to checkout commit_1... it would have to be like this: ```git checkout --detach commit_1;``` everything else is ok – eftshift0 Mar 28 '17 at 19:03
  • You say your attempt to use `filter-branch` failed, but you do not say *how* it failed. Your example shows pointless backslash-space sequences though, so I suspect you typo'd something. – torek Mar 28 '17 at 19:26
  • @torek, Have tried the following: `git checkout -b remove_lfs_fix ;` `git filter-branch --force --index-filter \ 'git rm -rf --cached --ignore-unmatch [project]/.OldFiles/blah1' \ --prune-empty --tag-name-filter cat -- --all;` **OUTPUT: fatal: bad revision ' --prune-empty'**; `git filter-branch --index-filter \ 'git update-index --remove [project]/.OldFiles/blah1/[file].[ext]' ..HEAD;` **OUTPUT: Found nothing to rewrite**; – Shadow Phoenix Mar 30 '17 at 05:58
  • It looks like you are cutting and pasting without understanding: those backslashes are in the other answer mainly for display purposes; they are not meant for you to enter unless you also enter the corresponding newlines. That's what is giving you the complaint about `--prune-empty` (complete with leading blank). The second complaint indicates that the commit on the left of `..HEAD` is at or after `HEAD`, which is not surprising if you are creating a new branch *at* that commit, as in your example code here. – torek Mar 30 '17 at 06:08
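
To illustrate the point about backslashes in the last comment, here is a minimal sketch in which each backslash is the final character on its line, so the shell joins the lines into one command. It builds a throwaway repository rather than touching a real one; the file names keep.txt and big.bin are placeholders of mine, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the pause that newer Git versions add before filter-branch starts.

```shell
#!/bin/sh
# Sketch: strip one directory from every commit with git filter-branch.
# All paths and file names are placeholders, not the asker's real ones.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation pause in newer Git

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

# commit_1: an ordinary file plus a stand-in for the large file
mkdir -p .OldFiles/blah1
echo "keep me" > keep.txt
echo "pretend this is 257 MB" > .OldFiles/blah1/big.bin
git add -A
git commit -q -m "commit_1"

# commit_2: the large file is deleted, but commit_1 still carries it
git rm -q -r .OldFiles
git commit -q -m "commit_2"

# Nothing follows the backslashes on these lines except the newline.
git filter-branch --force --index-filter \
  'git rm -r --cached --ignore-unmatch .OldFiles/blah1' \
  --prune-empty --tag-name-filter cat -- --all

# List every path still referenced by the rewritten branches.
remaining=$(git log --branches --pretty=format: --name-only | grep . | sort -u)
echo "$remaining"
```

Note that filter-branch keeps the pre-rewrite refs under refs/original/, so the old objects stay in the local repository until those refs are deleted and the repository is garbage-collected. A push of the rewritten branches should only send objects reachable from those branches.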

1 Answer


BFG was designed for exactly this case. I did read the "P.S. We do not have lfs or bfg installed in our project space." part of your question, but this still leaves you with two possibilities:

  • Install BFG. If you have Java installed on your machine, it's just a .jar file to download.

  • Use another machine. Using BFG is a one-time operation: you don't need it installed on your usual machine, just access to a machine where you can run BFG once, do the filtering, and then use the resulting repo everywhere.
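
As a rough sketch of what that one-time run could look like, assuming Java is available and the BFG jar has already been downloaded: the function name strip_large_blobs, the jar path, the repository URL, and the repo.git directory name are all placeholders of mine, and the 100M threshold is chosen to match the limit in the error message. BFG leaves the contents of the latest commit alone by default, which should be fine here because the large files were already deleted in commit_2.

```shell
# Sketch only: wraps a one-time BFG run in a helper function.
# Arguments are placeholders: the path to the downloaded BFG jar
# and the URL (or path) of the repository to clean.
strip_large_blobs() {
  jar=$1
  url=$2
  if [ -z "$jar" ] || [ -z "$url" ]; then
    echo "usage: strip_large_blobs <path/to/bfg.jar> <repo-url>" >&2
    return 2
  fi
  # Work on a fresh mirror clone so the original clone stays untouched.
  git clone --mirror "$url" repo.git &&
  # Remove every blob larger than 100 MB from the history.
  java -jar "$jar" --strip-blobs-bigger-than 100M repo.git &&
  # Drop the now-unreferenced objects from the mirror.
  ( cd repo.git &&
    git reflog expire --expire=now --all &&
    git gc --prune=now --aggressive )
}
```

It would be called as `strip_large_blobs ./bfg.jar <repo-url>`, after which the cleaned repo.git can be inspected before anything is pushed.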

Roberto Tyley
Matthieu Moy