I have a Git repo, and I accidentally committed some library files to the remote repo.

This has increased the repo size to about 6.23 GB. I then tried deleting the library using the following commands:

git filter-branch -f --index-filter "git rm -rf --cached --ignore-unmatch node_modules" -- --all

rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --prune=now
git gc --aggressive --prune=now

The library has now been removed from the repository and is no longer listed in the repo folders, but the local repo is still as big as before.

The commands above also take a long time to run, so I am not sure whether they worked properly.
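
One way to check whether a rewrite changed anything is to compare Git's own object-store numbers before and after each step, rather than the folder size. The following is only a sketch: it assumes it is run inside the repository, and `repo_size` is a hypothetical helper name, not a Git command.

```shell
# repo_size: hypothetical helper, prints the object-store size of the
# repository in the current directory.
repo_size() {
    # "git count-objects -v" reports "size" (loose objects) and
    # "size-pack" (packed objects), both in KiB; add the two together.
    git count-objects -v |
        awk '/^size-pack:|^size:/ {sum += $2} END {print sum " KiB"}'
}
```

Calling `repo_size` before and after the filter-branch and gc steps should show whether the object store actually shrank.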

I also tried pushing this to the remote repo:

git push --all --force

but the push does not succeed. It runs almost to the end and then fails with a message along the lines of the remote repo being unreachable or not responding.

I also tried rewriting the tags:

git filter-branch -f --index-filter 'git rm -r --cached --ignore-unmatch node_modules' --tag-name-filter 'cat' -- --all

I also tried the following to make it work

git config --global pack.windowMemory 0
git config --global pack.packSizeLimit 0
git config --global pack.threads "3"

But whatever I do, the size of the repo stays the same.

Note: I tried

git fsck --full --unreachable

There are several tags listed that are not reachable

keerthee
  • Only commits that *cannot* be accessed - except by hash - are pruned (i.e. check the repo again with git log and the path). Also, were the files large enough to notice? – user2864740 Jul 09 '15 at 05:07
  • Yeah, I checked with git log -- node_modules and I get no output. The individual files are not that large, but there are enough of them that the node_modules folder is about 30 MB – keerthee Jul 09 '15 at 05:16

2 Answers

I mention in "Git pull error: unable to create temporary sha1 filename" that git gc alone isn't enough.

One combination which should bring down the size of the repo should be:

git gc
git repack -Ad      # kills in-pack garbage
git prune           # kills loose garbage

However, this must come after any cleanup of big files (git filter-branch), and it only affects the local repo.

After pushing (git push --force) to the remote repo, said remote repo won't benefit from the same size reduction. A gc/repack/prune needs to be done on the remote side as well.

And if that remote side is TFS... this isn't easy/possible to do for now.
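
If you do have shell access to the machine hosting the bare repo, the commands from the question and from this answer can be combined into a single function. This is a rough sketch under that assumption: `shrink_repo` is a hypothetical helper name, and the argument is whatever path your Git directory lives at.

```shell
# shrink_repo: hypothetical helper that runs the cleanup sequence against
# the Git directory given as $1 (a bare repo, or the .git folder of a clone).
shrink_repo() {
    git --git-dir="$1" reflog expire --expire=now --all  # drop reflog entries that keep old commits alive
    git --git-dir="$1" gc --prune=now                    # collect garbage immediately
    git --git-dir="$1" repack -Ad                        # kills in-pack garbage
    git --git-dir="$1" prune                             # kills loose garbage
}
```

For a local clone this would be called as `shrink_repo .git`. On a server it would take the bare repository's path instead.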

VonC
  • I am unable to repack. "Unlink of file '.git/objects/pack/pack-36a49c503c404065dd29b6802740456aac25fa2f.pack' failed. Should I try again? (y/n) y" is the message I get – keerthee Jul 09 '15 at 09:09
  • Actually the size has increased now – keerthee Jul 09 '15 at 09:09
  • @keerthee do you have another process using that file? (as in http://stackoverflow.com/a/6076796/6309 or http://stackoverflow.com/q/25138946/6309 or http://stackoverflow.com/a/10181965/6309) – VonC Jul 09 '15 at 09:13
  • I saw that link, but I don't think that file is being used. I will close applications and try the repack again – keerthee Jul 09 '15 at 09:16
  • I get the same thing again. I looked up the handle on the pack object in Process Explorer, and the process owning it is git-upload-pack.exe, but I am unable to kill this process – keerthee Jul 09 '15 at 10:59
  • @keerthee try and reboot, if possible – VonC Jul 09 '15 at 11:00
  • VonC, I tried the repack and prune, yet it is still the same size – keerthee Jul 09 '15 at 13:35
  • No luck, it is still the same size – keerthee Jul 10 '15 at 07:18
  • @keerthee What would happen if, after that gc-repack and prune, you try to clone locally that repo? Would the clone be smaller? – VonC Jul 10 '15 at 07:20
  • All the above operations are made in the local repo only, it has already been cloned from remote, also not pushed to the remote repo. The thing is the local repo is itself the same size – keerthee Jul 10 '15 at 07:24
  • @keerthee I understand that. I am asking: what if you clone the local repo where you did the git-repack and prune: would the local clone of the local repo be smaller then? – VonC Jul 10 '15 at 07:25
  • I tried doing it, but it looks like it is cloning the remote repo:
    $ git clone --local file:///C:/Users//Documents/SourceCode
    warning: --local is ignored
    Cloning into 'Code'...
    warning: no threads support, ignoring pack.threads
    remote: Counting objects: 6906, done.
    remote: Compressing objects: 100% (3112/3112), done.
    Receiving objects: 100% (6906/6906), 5.88 GiB | 8.84 MiB/s, done.
    remote: Total 6906 (delta 4011), reused 6127 (delta 3647)
    Resolving deltas: 100% (4011/4011), done.
    Checking connectivity... done.
    Checking out files: 100% (1428/1428), done.
    – keerthee Jul 10 '15 at 10:56
  • @keerthee don't use the `--local` option or `file:///`: if your repo is in folder `a`, what happens if you clone it into `b`? `git clone a b`. – VonC Jul 10 '15 at 10:58
  • I tried it, but the size still remains the same: $ git clone SourceCode SCCopy Cloning into 'SCCopy'... done. Checking out files: 100% (1428/1428), done. – keerthee Jul 10 '15 at 11:02
  • @keerthee when you select "Properties" on the destination folder, the size is still 6GB+? If that is the case, I can only assume the `git filter-branch` didn't work correctly. – VonC Jul 10 '15 at 11:03
  • Yeah, that's the case. One thing I forgot to mention: I have cloned only one branch (master) to the local repo, and there are still other branches on the remote. Will that cause any issues? – keerthee Jul 10 '15 at 11:07
  • @keerthee no, that shouldn't affect the filter branch, the gc-repack-prune or the clone. I would double-check if the filter-branch worked or not. – VonC Jul 10 '15 at 11:08
  • One more thing: I see lots of unreachable commits and blobs listed when I run git fsck --full --unreachable – keerthee Jul 10 '15 at 12:08

I am sorry, folks! In addition to node_modules, there was another main culprit, ntvs_analysis.dat, that caused the huge size of the repo. I have now removed that as well, and the repo size is down to 75 MB. So the steps I followed were in fact working.
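
A culprit like this can be hunted down by listing the largest blobs anywhere in the history. The following is a sketch rather than the exact method used here: it assumes it is run inside the repository, and `largest_blobs` is a hypothetical helper name.

```shell
# largest_blobs: hypothetical helper, lists the N largest blobs reachable
# from any ref (default 10), biggest first, as "<size in bytes> <path>".
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        awk '$1 == "blob" {sub(/^blob /, ""); print}' |
        sort -rn |
        head -n "${1:-10}"
}
```

Running `largest_blobs 20` shows which paths to pass to the `git rm --cached --ignore-unmatch` index filter. The sizes are uncompressed blob sizes, so they only approximate the space taken in the packs.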

keerthee
  • Thank you for the feedback. +1. I still think my answer can help ;) – VonC Jul 24 '15 at 13:47
  • I have one more problem. The size on the cloud remains the same, even though I did git push --all --force – keerthee Jul 28 '15 at 13:56
  • I would recommend the gc/repack/prune sequence on the bare repo on the cloud then. – VonC Jul 28 '15 at 14:00
  • Is a bare repo necessary? I have a clone of the cloud repo locally; I don't have a bare repo locally – keerthee Jul 28 '15 at 14:14
  • @VonC It looks like the deleted file still exists in the remote Git repo. I cloned the remote repo to another folder, and .ntvs_analysis.dat still exists there with the same size. Is there any way to make sure the rewritten commits are pushed properly to the remote? – keerthee Jul 29 '15 at 09:10
  • That is why I mentioned in my previous comment "on the cloud" (when you clone locally, the repo is generally not bare, and that is ok). Did you do a `git push --force` after removing the big files from the history? – VonC Jul 29 '15 at 09:12
  • Then you need to contact the remote server admins for them to do a gc/repack/prune on the bare repo server side. – VonC Jul 29 '15 at 09:14
  • OMG... Okay https://connect.microsoft.com/VisualStudio/feedback/details/1019193/unable-to-clean-a-git-repo-in-tfs It looks like TFS doesn't have such a feature – keerthee Jul 29 '15 at 09:17
  • I have one more problem now. I created a new repo and pushed the cleaned repo there, but only the master branch was available there. So I cloned the old repo again to get the other branches. The size is still the same, and when I try the cleanup above nothing changes; the size remains the same – keerthee Aug 04 '15 at 12:54
  • From the old repo, could you push all branches (http://stackoverflow.com/a/4886153/6309) to your new remote repo? – VonC Aug 04 '15 at 13:47
  • Actually, I deleted the cleaned local repo, and I have now cloned the old repo and am trying to clean it up, but none of the above steps works – keerthee Aug 04 '15 at 13:56
  • Yes, but maybe pushing all branches will not make the remote repo bloated. Try to backup the remote repo first, and then try and push all branches, to see if the size is too big. – VonC Aug 04 '15 at 13:57
  • Actually, the local Git repo itself is big – keerthee Aug 04 '15 at 14:01
  • You could try and push branch by branch, to see the effect on the size of the remote repo. – VonC Aug 04 '15 at 14:02
  • Hi VonC, I guess I haven't described my problem properly. I deleted the cleaned repo locally. Now I have cloned the old repo from the remote again, and it is as big as before (6 GB) locally too. When I try to clean it up, i.e. delete the huge file, the file no longer exists (I had already deleted it and pushed to the old remote repo). So those files are still somewhere in Git, and I don't know how to delete them – keerthee Aug 04 '15 at 14:08