There are several files (mostly PNGs) amounting to around 5 GB in my Git repository. These files are spread across different directories, so basically I just need to delete those directories. E.g.:

dataset1/ #contains around 1 G of pngs
dataset2/ #contains around 1 G of pngs
library1/ #contains around 3 G of .so

I have now deleted them, committed, and pushed again. But of course, if I clone the repository again, they are still part of the clone, since they were once checked in. I can confirm that they are still being downloaded because I can see the number of objects, and it is a huge number (52768). After deleting the 5 GB, I was expecting around 3000 objects.

How can I permanently delete them from the upstream history as well, so that they don't appear in the clone anymore?

infoclogged
  • the possible duplicate is for a single file, my problem is with multiple directories. – infoclogged Dec 13 '17 at 11:39
  • I'm sure you can figure out how to substitute those directories into any examples given there... but an alternative duplicate that specifically mentions folders: [Remove folder and its contents from git/GitHub's history](https://stackoverflow.com/questions/10067848/remove-folder-and-its-contents-from-git-githubs-history) – underscore_d Dec 13 '17 at 12:00
  • @underscore_d : yes, the new link addresses my problem. Guess, SO is about giving direct answers and not figuring out the solution yourself, as mentioned by you in your comment. I hope you get my point. – infoclogged Dec 13 '17 at 12:07

1 Answer

This is where the BFG Repo-Cleaner is useful. It will help you delete large files, and whole folders, from your history.

Take special care to note the last step:

At this point, you're ready for everyone to ditch their old copies of the repo and do fresh clones of the nice, new pristine data. It's best to delete all old clones, as they'll have dirty history that you don't want to risk pushing back into your newly cleaned repo.
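
For the directories in the question, the whole procedure might look roughly like the sketch below. It is only an outline under some assumptions: `bfg.jar` is the downloaded BFG jar in the current directory, the repository URL is a placeholder, and `repo.git`, `run`, `purge_folders` and `DRY_RUN` are names made up for this example. By default it only prints the commands so you can review them first. Note that the BFG leaves the latest commit untouched by default, which fits here because the directories are already deleted in the current commit.

```shell
#!/bin/sh
# Sketch only: bfg.jar, repo.git and the repository URL are assumptions.
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = actually run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# purge_folders <repo-url> <folder-name>...
purge_folders() {
    repo_url=$1
    shift
    # Work on a bare mirror clone so every branch and tag gets rewritten.
    run git clone --mirror "$repo_url" repo.git
    # --delete-folders matches on the folder name, not on its path.
    for dir in "$@"; do
        run java -jar bfg.jar --delete-folders "$dir" repo.git
    done
    # The BFG rewrites the commits; git still has to drop the orphaned blobs.
    run git -C repo.git reflog expire --expire=now --all
    run git -C repo.git gc --prune=now --aggressive
    # From a mirror clone this updates all refs on the remote.
    run git -C repo.git push
}
```

You would call it as `purge_folders git://example.com/some-big-repo.git dataset1 dataset2 library1`, check the printed commands, and then repeat with `DRY_RUN=0`. Keep a backup of the repository before doing the real run, since the push rewrites the upstream history.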

Edmund Dipple