
I have been using Bitbucket because it allows a 2GB repo size and offers private repos for free. When I hit the 2GB limit, I followed the BFG tutorial to shrink the repo by removing previous history. Sadly, I only got halfway: the reduction never showed up after I pushed to the remote. Locally, the "name.git" folder shrank from 2GB to around 500 to 600MB, while my Unity project folder is around 1GB.

Out of frustration I created a new repo instead, copy-pasted the project into it, and used Git LFS to track all the big files by file type. The push took much longer than usual. I can't tell whether it really takes that long or whether it has frozen because some requirement is missing. LFS provides an additional 10GB of storage (I'm not sure whether that is per repo), which I believe is meant for storing big files.

What should I do to get peace of mind, that is, a reliable online backup that still keeps version control? As a stopgap I created a Google Drive folder (not the best way to back up big project files) to hold my big files online while I resolve the Git issues. Bitbucket might be fine for small game projects (2D or 3D), but what alternatives are there besides GitHub, Bitbucket, and SVN? Is there a free Git hosting site that allows really large repos?

Also, about LFS: do I have to run BFG first before I start tracking the file types that tend to be large (e.g. .psd, .mp3, .dae...)?

1 Answer


You can shrink your repo from the git command line. The steps are as follows:

  1. Clone the repo locally: `git clone <URL>`
  2. List the file sizes so you can decide which files to delete: `find . -type f -printf "%s\t%p\n" | sort -nr | head -n 100`
  3. Rewrite the history and reclaim the freed space with the commands below (replace filename with the path you want to remove):

git filter-branch --tag-name-filter cat --index-filter 'git rm -r --cached --ignore-unmatch filename' --prune-empty -f -- --all

rm -rf .git/refs/original/

git reflog expire --expire=now --all

git gc --prune=now

git gc --aggressive --prune=now

  4. Push the shrunken local repo to the remote: `git push origin --force --all` and `git push origin --force --tags`
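
The cleanup in step 3 can be collected into one small shell function. This is only a sketch: the function name `purge_path_from_history` is made up for illustration, and setting `FILTER_BRANCH_SQUELCH_WARNING` is an addition that skips the pause newer Git versions insert before `git filter-branch` runs. It assumes you run it inside a fresh clone with a clean working tree, and that the path contains no single quotes.

```shell
#!/bin/sh
# Hypothetical helper: remove one path from every commit on every ref,
# then drop the leftover references so Git can actually delete the objects.
# WARNING: this rewrites history. Run it in a fresh clone only.
purge_path_from_history() {
    target="$1"
    if [ -z "$target" ]; then
        echo "usage: purge_path_from_history <path>" >&2
        return 1
    fi

    # Rewrite all branches and tags, dropping the path from each commit.
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f \
        --tag-name-filter cat \
        --index-filter "git rm -r --cached --ignore-unmatch -- '$target'" \
        --prune-empty -- --all || return 1

    # filter-branch keeps backups of the old refs under refs/original.
    git for-each-ref --format='%(refname)' refs/original |
        while read -r ref; do
            git update-ref -d "$ref"
        done

    # Expire the reflog and garbage-collect the now unreachable objects.
    git reflog expire --expire=now --all
    git gc --aggressive --prune=now
}
```

For example, `purge_path_from_history Assets/huge_texture.psd` (a hypothetical path), followed by the force pushes from the last step.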

Note: if you work on OS X, you can also refer to Steve Lorek's blog.
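
The `-printf` option used in step 2 is a GNU `find` extension, and the `find` that ships with OS X does not support it. A portable sketch of the same listing, assuming only POSIX `find`, `wc`, `sort` and `head`, might look like the following. The function name `list_largest_files` is hypothetical, and unlike step 2 it skips the `.git` directory so that only working-tree files are listed.

```shell
#!/bin/sh
# Hypothetical helper: list the N largest regular files under a directory.
# Output is "<bytes><TAB><path>", largest first. Files inside .git are skipped.
list_largest_files() {
    dir="${1:-.}"
    count="${2:-100}"
    find "$dir" -type f ! -path '*/.git/*' -exec sh -c '
        for f do
            # wc -c pads its output with spaces on some systems; strip them.
            size=$(wc -c < "$f" | tr -d " ")
            printf "%s\t%s\n" "$size" "$f"
        done' sh {} + |
        sort -rn |
        head -n "$count"
}
```

Calling `list_largest_files . 100` from the root of the clone gives roughly the same list as step 2.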

GitHub and Bitbucket are both remote hosts for Git, so the difference among GitHub, Bitbucket and SVN is mainly the difference between the two version control systems (VCS), Git and SVN; you can refer here.

As for GitHub versus Bitbucket: as you know, Bitbucket lets you create private repos for free (if the repo is used by fewer than 5 users), while GitHub requires payment for private repos. So I always create public repositories on GitHub and private repositories on Bitbucket.

Part II

There is another situation to consider: the `.git/objects/` directory contains both loose objects and packfiles, so we can convert all loose objects to packfiles and then deal with the packfile.

  1. Convert all loose objects into packfiles with `git gc`, then check the pack size with `git count-objects -v`
  2. Show the .idx packfiles with `find .git/objects/pack`; assume the name is pack-8319f98bb0c73a6f3f15905772e8743bf2d28dfd.idx
  3. List the 3 biggest objects with `git verify-pack -v .git/objects/pack/pack-8319f98bb0c73a6f3f15905772e8743bf2d28dfd.idx | sort -k 3 -n | tail -3`, then copy the SHA-1 value of the biggest object; assume the SHA-1 is ac78de3
  4. Find the file that object belongs to with `git rev-list --objects --all | grep ac78de3`; assume the biggest file is src.zip
  5. Find the commits that changed the biggest file with `git log --oneline --branches -- src.zip` and copy the earliest commit id; assume it's a50197e
  6. Remove the file from every commit starting at the earliest one that changed it: `git filter-branch --index-filter 'git rm --ignore-unmatch --cached src.zip' -- a50197e^..`
  7. Remove the now useless references with `rm -Rf .git/refs/original` and `rm -Rf .git/logs`, then run `git gc` and `git prune --expire now`
  8. Review the pack size again with `git count-objects -v`
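
Steps 2 to 4 can also be approximated in one pipeline that does not need the packfile name: `git rev-list --objects --all` lists every reachable object together with its path, and `git cat-file --batch-check` reports each object's type and size. The sketch below wraps that in a hypothetical function `largest_blobs`. Note that it reports each blob's uncompressed size, not its size inside the packfile.

```shell
#!/bin/sh
# Hypothetical helper: print the N largest blobs reachable from any ref.
# Output is "<bytes> <sha> <path>", largest first.
largest_blobs() {
    count="${1:-3}"
    git rev-list --objects --all |
        # %(rest) carries the path that rev-list printed after the object id.
        git cat-file --batch-check='%(objecttype) %(objectsize) %(objectname) %(rest)' |
        # Keep only blobs (file contents) and drop the type column.
        sed -n 's/^blob //p' |
        sort -rn |
        head -n "$count"
}
```

Running `largest_blobs 3` inside the clone prints the three biggest files ever committed, which you can then feed into steps 5 and 6.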
Marina Liu
  • I'm at step 3, the `git filter-branch` part. While it was processing, I saw a ".git-rewrite" folder. Should I add, commit and push that folder now, or later on? – Tredecies Nocturne Nov 23 '16 at 10:46
  • @Tredecies Nocturne, you can try **git filter-branch --tree-filter 'rm -f filename' HEAD** instead – Marina Liu Nov 23 '16 at 11:39
  • I see. I got this error when attempting to push repo first at step 4. ('error: src refspec –all does not match any.') – Tredecies Nocturne Nov 23 '16 at 12:10
  • what about `git push -f origin/`? – Marina Liu Nov 23 '16 at 12:25
  • I tried this: `git push -f origin/\"shooter.git\"` where "shooter.git" is my repo name. I got this error message: `fatal: origin/shooter.git does not appear to be a git repository` – Tredecies Nocturne Nov 24 '16 at 01:31
  • Sorry for the misunderstanding. You should use the branch name after `origin`, such as `git push -f origin master`, or you can try `git push --all` – Marina Liu Nov 24 '16 at 01:41
  • Tried. The result is " `Everything up-to-date` " and the repo size at Bitbucket.org is still over 2GB – Tredecies Nocturne Nov 24 '16 at 02:13
  • When I got to step 3, I entered the command 'git reflog expire --expire=now –all' and received an error. I then ran the last 3 commands together and they completed. But I feel the repo size still hasn't been reduced. – Tredecies Nocturne Nov 24 '16 at 02:15
  • What's the size of the file you removed? In my experience, these steps reduce the size successfully. Can you execute the steps above one after another, without other operations in between? – Marina Liu Nov 24 '16 at 02:22
  • Hmm... I'll check. My current repo size is 2GB. I want to cut at least half of it. By the way, would deleting the file the normal way before going through the above steps help too, or is it better to make no changes so that it works? – Tredecies Nocturne Nov 24 '16 at 02:35
  • Yes, the biggest file can be removed by the first command in step 3. It will remove the file from every commit in the history. And just a kind reminder: if the biggest file exists in more than one branch, you should execute these steps on each of those branches. – Marina Liu Nov 24 '16 at 02:40
  • I was planning to remove the files manually, one by one, with the DELETE key before going through the 4-step repo reduction. – Tredecies Nocturne Nov 24 '16 at 02:44
  • please try to use **git filter-branch --tree-filter 'rm -f filename' HEAD** or **git filter-branch --tag-name-filter cat --index-filter 'git rm -r --cached --ignore-unmatch filename' --prune-empty -f -- --all** instead. – Marina Liu Nov 24 '16 at 02:51
  • Ok. Got it. I'll re-run it. – Tredecies Nocturne Nov 24 '16 at 02:52
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/128893/discussion-between-marina-msft-and-tredecies-nocturne). – Marina Liu Nov 24 '16 at 03:12
  • @TredeciesNocturne, hi, have you tried it successfully? If yes, please mark it as answer since it will help others who have similar questions:) – Marina Liu Nov 28 '16 at 06:40
  • Sadly, it still doesn't work. But I'll give you an upvote for this valuable answer and as a reference. I guess it won't work once my Git repo on Bitbucket has reached the HARD LIMIT. Even if I delete some files locally (w/o using BFG), I'm still unable to push any changes. It would be best if the repo hadn't reached the limit yet. I'm sorry. I tried my best to follow your advice. – Tredecies Nocturne Dec 06 '16 at 10:55
  • Sorry to hear that. But I found another situation: you can convert the loose objects to packfiles in the `/.git/objects/pack` folder and then rewrite the history. I've added it to my answer as **Part II**; hope it's helpful. – Marina Liu Dec 07 '16 at 07:23