20

This is a follow-up to a previous question on Stack Overflow.

I've tried to remove large files from my local git history, but the tool (BFG Repo-Cleaner) suggested in this question states that my private GitHub repository is not a valid git repository.

The command I use is:

java -jar bfg-1.12.12.jar --strip-blobs-bigger-than 99M https://github.com/name/repo.git

Which eventually results in:

Aborting : https://github.com/name/repo.git is not a valid Git repository.

I couldn't find a solution. Is the tool incompatible with private or HTTPS GitHub repositories? Alternatively, how would I use git-filter-branch to remove all files larger than 99 MB from my local Git history?

The project is only about 6 MB, has about 50 commits so far, and nobody else is working on it.

boolean.is.null

4 Answers

19

Point to a local copy, not a remote.

You have given your GitHub URL to the tool, but the usage section on their site says that you should work from a local copy of your repository:

Usage

First clone a fresh copy of your repo, using the --mirror flag:

$ git clone --mirror git://example.com/some-big-repo.git

This is a bare repo, which means your normal files won't be visible, but it is a full copy of the Git database of your repository, and at this point you should make a backup of it to ensure you don't lose anything.

Now you can run the BFG to clean your repository up:

$ java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git

There is lots of other good content on that page; I recommend you read the entire thing before trying again.
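A minimal sketch of that workflow as a shell function, under some stated assumptions: the function names `run` and `clean_big_blobs` are made up for illustration, the URL, jar path, and size limit are placeholders you supply, and the `reflog expire` / `gc` / `push` steps follow what the BFG usage page describes after the cleaning step. Setting `DRY_RUN=1` only prints the commands, which is a safe way to check them before touching anything.

```shell
#!/bin/sh
# Sketch only: clone a mirror, clean it locally with BFG, then push.
# Nothing here is part of BFG itself; names and arguments are illustrative.

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

clean_big_blobs() {
  url=$1        # remote URL; only git itself ever talks to it
  bfg_jar=$2    # path to the BFG jar you downloaded
  limit=$3      # size threshold, e.g. 99M

  # Name the local mirror after the last URL component, ending in .git.
  mirror=$(basename "$url")
  case "$mirror" in
    *.git) ;;
    *) mirror="$mirror.git" ;;
  esac

  run git clone --mirror "$url" "$mirror" || return 1
  # BFG receives the LOCAL directory, never the URL.
  run java -jar "$bfg_jar" --strip-blobs-bigger-than "$limit" "$mirror" || return 1
  # Drop the now-unreferenced objects from the local mirror.
  run git -C "$mirror" reflog expire --expire=now --all || return 1
  run git -C "$mirror" gc --prune=now --aggressive || return 1
  # Publishing rewrites the remote history; do this only once you are satisfied.
  run git -C "$mirror" push
}
```

For example, `DRY_RUN=1 clean_big_blobs https://github.com/name/repo.git bfg.jar 99M` prints the five commands without running them. Make a backup of the mirror before running it for real.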

ChrisGPT was on strike
  • 4
    Thank you for the feedback! How do I find the _name_ like `some-big-repo.git` of the local git repository? Do I point it towards the `.git` directory? – boolean.is.null Jul 18 '16 at 12:28
  • 5
    @boolean.is.null, the name is just the name of the folder where you cloned the repository. By default this will match what is used on the remote (`some-big-repo.git` in this example), but it could be different if you add more options to `git clone` or rename things later. Yes, point to the `.git/` directory. If you're using a non-bare clone I suspect you could also use its root. – ChrisGPT was on strike Jul 18 '16 at 12:49
  • 3
    Cleaning the file out of the repo is already difficult enough. Why would I want to do the work in a mirror? Now I'll just have twice as many problems to deal with. – leerssej Aug 28 '19 at 05:51
  • 1
    @leerssej, because that's how the tool works. The most likely next step is to do a `--force-with-lease` push back to your published repo, though of course that comes with all the inherent challenges with force pushing to a shared repository. Requiring a local repo adds very little complexity, and is consistent with the rest of the ecosystem, where virtually everything you can do happens locally. If you want to modify a remote repo, do the changes on your machine and then push. – ChrisGPT was on strike Aug 28 '19 at 11:07
  • @Andrew, maybe you are confused about the question being asked? OP is asking about a specific error message, caused by trying to run BFG Repo Cleaner against a GitHub URL directly. My answer doesn't _just_ quote the documentation: it points out the error _and_ quotes the documentation as support. – ChrisGPT was on strike Nov 25 '20 at 23:58
  • @Chris but it's perfectly valid to put the GitHub URL in the git clone line. So, again, it's confusing. `$ git clone --mirror https://github.com/user/projectname` `$ cd projectname.git` `$ java -jar path/bfg.jar --strip-blobs-bigger-than 100M` The fact that OP had to follow up in the comments is proof the answer wasn't good enough. – Andrew Nov 26 '20 at 00:10
  • The `cd` is optional. You can give the path to the _local_ repository as an argument when you run BFG, as I have shown, or you can omit it, in which case the current directory is used, which you seem to prefer. In that case, you should be _in_ the repository. But it is _always a local directory_. That was OP's problem, and that is the question that I answered. You might have a different question, which is fine, but it doesn't mean that this answer is wrong. It just answers a different question to yours. – ChrisGPT was on strike Nov 26 '20 at 00:13
  • 1
    @Chris I never said it was wrong. It's just not good enough to help solve the problem. It's incomplete. OP didn't get the answer from it and neither did I and neither did Dave. We had to read your comment on the answer to get there. That suggests to me it should be part of the answer and would make it more complete. – Andrew Nov 26 '20 at 00:27
13

I have fixed this using the following commands:

STEP 1

cd some-big-repo.git

STEP 2

java -jar path/bfg.jar --strip-blobs-bigger-than 100M 

There is no need to name the repository explicitly if you are already inside its directory: BFG detects the repository automatically and runs against it.

Mesut Akcan
1

Remember that with Git, every change to your history must be made locally first. Once you are satisfied with the result, you publish it by pushing to the remote repository.

So here you have to give BFG the path to your local repository.
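As an illustration, a small pre-flight check can catch the mistake from the question before BFG is even started. This is a hedged sketch: `is_local_git_repo` is a hypothetical helper, not part of BFG or Git, and it only relies on `git rev-parse --git-dir`, which succeeds inside both bare and non-bare repositories.

```shell
#!/bin/sh
# Hypothetical helper (not part of BFG): succeed only when the argument is
# a directory on disk that git recognises as being inside a repository.
is_local_git_repo() {
  target=${1:-.}                   # default to the current directory
  [ -d "$target" ] || return 1     # URLs and missing paths fail here
  # Works for bare mirrors and for normal working trees alike.
  git -C "$target" rev-parse --git-dir >/dev/null 2>&1
}
```

It could be used as `is_local_git_repo some-big-repo.git && java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git`. Passing `https://github.com/name/repo.git` fails the directory test, which mirrors the "not a valid Git repository" error.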

Philippe
  • 9
    The key phrase here is `local repository`: from inside the folder containing the `.git` folder, I ran `java -jar bfg -b 100M .git` – Nathan Dortman May 29 '17 at 14:44
  • 1
    Placing the jar in the same folder as the repo and running it like this should be the default recommendation; I couldn't get it to work any other way. Thank you! – Sebastián Vansteenkiste Jun 13 '19 at 17:26
-1

It's very simple. Just run the following on the command line:

#Change directory to the root of your git repo
cd C:\Projects\MyRepositoryFolder

#Drag the bfg.jar file (or whatever your .jar file is named) into the repo folder you changed directory to above, then run:
java -jar bfg.jar -b 100M
Egli Becerra