
A large file was mistakenly committed to our Git repository. First, I identified the file by running the following command from "How to find/identify large files/commits in Git history?":

$ git rev-list --objects --all \
> | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
> | awk '/^blob/ {print substr($0,6)}' \
> | sort --numeric-sort --key=2 \
> | cut --complement --characters=13-40 \
> | numfmt --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest

This outputs: 6b82d8f18acd 716MiB MSSender/DebContainer.tar.tgz

Then I ran git filter-branch --tree-filter "rm -f DebContainer.tar.tgz" HEAD --all, following https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History#The-Nuclear-Option:-filter-branch, to delete this file from all branches. While processing, it creates a .git-rewrite folder, but afterwards the .git folder is still the same size. What might I be doing wrong? Thank you.

Oguz Ozcan
  • You say that you found that the faulty file was `MSSender/DebContainer.tar.tgz`, but you're trying to remove the file `DebContainer.tar.tgz`... – zigarn Dec 20 '17 at 08:46
  • Yes, I also tried to delete `MSSender/DebContainer.tar.tgz` later but no luck. – Oguz Ozcan Dec 20 '17 at 08:46
  • Also tried to delete *.tgz – Oguz Ozcan Dec 20 '17 at 08:47
  • Is the file still visible in history? (if your question is just about the `.git` not reducing in size: https://stackoverflow.com/questions/44760499/dropping-a-commit-in-git-rebase-i-does-not-reduce-the-size-of-git-folder/44760763#44760763) – zigarn Dec 20 '17 at 08:51
  • The file was pushed, then deleted in a local branch and pushed again to the dev branch (we use git-flow). The file now exists only as an object in the `.git` folder. So my question is about reducing the size of the `.git` folder. – Oguz Ozcan Dec 20 '17 at 08:54
  • `filter-branch` is for rewriting the history. Using git-flow is not applicable for this kind of manipulation. If any commit in the commit history on any branch still contains the file, it will never be deleted from the repository. After the `filter-branch` execution (with `--tag-name-filter cat`), remove the 'refs/original' references, then run `git -c gc.reflogExpire=now gc --prune=all`, then the blob should be removed from your `.git` folder and you can force-push all the rewritten references. – zigarn Dec 20 '17 at 09:02
  • When I run your suggested command I get the following error: $ git -c gc.reflogExpire=now gc --prune=all Counting objects: 9326, done. Delta compression using up to 4 threads. Compressing objects: 100% (3056/3056), done. Writing objects: 100% (9326/9326), done. Total 9326 (delta 4499), reused 9326 (delta 4499) Unlink of file '.git/objects/pack/pack-0ed865354d1ce70f9eeab891f47af8c8cf34809c.pack' failed. Should I try again? (y/n) y Unlink of file '.git/objects/pack/pack-0ed865354d1ce70f9eeab891f47af8c8cf34809c.pack' failed. Should I try again? (y/n) – Oguz Ozcan Dec 20 '17 at 09:06
  • Looks like you have some other process accessing the file (https://stackoverflow.com/questions/4389833/unlink-of-file-failed-should-i-try-again) – zigarn Dec 20 '17 at 09:12
  • Yes, I've overcome that and now the folder size has decreased. Thank you so much. (If you post this as an answer, I will accept and upvote it so you get points :)) I also have a question about branches and pushing. Can I push my branch directly to the remote? What about merging my branch into the dev branch? – Oguz Ozcan Dec 20 '17 at 10:18
  • If the file was only in your branch and the rewritten history does not impact the `dev` branch, then simply force-push your branch, then merge into `dev`. Otherwise, the `dev` history was rewritten and also needs to be force-pushed. – zigarn Dec 20 '17 at 13:03

1 Answer


The procedure for shrinking the repository after rewriting history is described in the git filter-branch documentation.

The removed file is still reachable through some references:

  • backup references created by filter-branch (under refs/original/)
  • reflog references
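
To see which of these are still present, something like the following can be run inside the rewritten repository. This is a minimal sketch: the function name `show_leftover_refs` is hypothetical and only groups standard Git commands for convenience.

```shell
# Hypothetical helper (not part of Git): report what can still reference
# objects from before the rewrite, plus the current object store size.
show_leftover_refs() {
  echo "== backup refs under refs/original/ =="
  git for-each-ref --format='%(refname)' refs/original/

  echo "== most recent reflog entries =="
  git reflog show --all | head -n 20

  echo "== object store size =="
  git count-objects -vH
}
```

Call `show_leftover_refs` from the repository root. If the first section prints nothing, no filter-branch backups remain; reflog entries may still keep old commits alive until they are expired.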

So, to shrink the .git folder, you have to get rid of those references:

  • either by creating a new clone from the rewritten one
  • or by deleting the references and garbage-collecting the repository content:

    git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
    git -c gc.reflogExpire=now gc --prune=all
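
Putting the rewrite and the cleanup together, the whole procedure might be sketched as below. Several things here are assumptions rather than what was run in the question: the function name `purge_path` is hypothetical, `--index-filter` with `git rm --cached --ignore-unmatch` is used instead of the `--tree-filter` from the question, the path must be the full repository-relative one (`MSSender/DebContainer.tar.tgz`, as the first comment points out), and `FILTER_BRANCH_SQUELCH_WARNING` only matters on newer Git versions that print a deprecation notice.

```shell
# Hypothetical helper: remove one path from every branch and tag,
# then drop the leftovers that keep the old blob in .git.
# Usage: purge_path MSSender/DebContainer.tar.tgz
purge_path() {
  path_to_remove=$1

  # Rewrite all refs. --index-filter edits the index without checking out
  # each commit; --ignore-unmatch lets commits without the file pass.
  FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
    --index-filter "git rm -q --cached --ignore-unmatch -- '$path_to_remove'" \
    --tag-name-filter cat -- --all || return 1

  # Delete the backup refs that filter-branch stores under refs/original/.
  git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
      git update-ref -d "$ref"
    done

  # Expire reflog entries and prune the objects that are now unreachable.
  git -c gc.reflogExpire=now gc --prune=all
}
```

This rewrites history in place, so it is safest to try it on a fresh clone first. On Windows, the `gc` step can fail with "Unlink of file ... failed" if another process holds the pack file open, as seen in the comments.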
    

NOTE:
Any modified reference needs to be force-pushed to the original repository.
Anyone who cloned this repository needs to carefully update their clone (git pull --rebase for each local branch is probably the best option).
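
As a rough illustration of the force-push step, assuming every branch and tag was rewritten and the remote accepts non-fast-forward pushes. The function name `publish_rewritten` is hypothetical, and `origin` is only a default for the remote name.

```shell
# Hypothetical helper: overwrite the remote's branches and tags with the
# rewritten local ones. --all and --tags cannot be combined in one push,
# so two pushes are needed.
publish_rewritten() {
  remote=${1:-origin}
  git push --force --all "$remote" &&
    git push --force --tags "$remote"
}
```

For example, `publish_rewritten origin`. If only one branch was affected, a plain `git push --force origin <branch>` for that branch is enough.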

zigarn