
I have a local-only repository which used to contain very large files (scans and some DB files). At some point I decided I'd remove the directory containing all those, and rewrite the history to eradicate the directory in question.

I ended up with a very light repo, but the .git directory still takes 1.3G of space. There is only one pack, and I identified at least one blob in this pack taking a lot of space.

I tried a lot of things to clean up the repo, including various summonings of git gc, but nothing works, not even git forget-blob. Git forget-blob tells me « not found in the repo history ».

At that point, I'm lost. Any help appreciated.

Thanks!

EDIT: some additional information which I find very bizarre. Git verify-pack shows me 3 very big files:

git verify-pack -v .git/objects/pack/pack-5cc03e9fbdbdff4ce1bbeb43c55c3e17875f2bd7.idx| sort -k 3 -n | tail -3
4983118ae60be35299b153dc5850134329f6ddf0 blob   7336960 2000979 615935480
5c810dfffa6a033631596218c43a7360cf2aff10 blob   12455669 1197771 6330554
25012927d95cf3bd15f2a8cb30da2c4f4b988e82 blob   105476096 83834099 532101381

However, I cannot get any information on those blobs. How's that possible?

git rev-list --objects --all  | grep 250129
zsh: done       git rev-list --objects --all | 
zsh: exit 1     grep --color 250129
Didier Verna
  • `forget-blob` is not a standard Git command, so it must be something you installed (perhaps from [here](http://stackoverflow.com/a/41801085/1256452)?). If so, the linked script appears to have a bug, but the bug should leave the blob in there so that a second run of `git forget-blob` also pretends to remove it, in the case I'm thinking of. – torek Mar 20 '17 at 15:30
  • I don't think it's a bug in the script. I have edited my original question with additional information. – Didier Verna Mar 20 '17 at 15:40
  • Ah, so you deleted the file using something other than the forget-blob script originally? It would help a lot if you included *exact commands* and their output (cut and paste the text, not screen-shots, if at all possible). I'm guessing now that the object is in fact not reachable in the pack, but without running `git repack -A -d` Git will not rebuild the pack to discard the unreachable object. – torek Mar 20 '17 at 15:42
  • Thank you! I'll try that. Your comment is not on the right question. Care to answer on https://tex.stackexchange.com/questions/566539/is-the-transparent-shadows-hack-for-beamer-blocks-broken ? If you can't for some reason, I'll propagate your answer myself. – Didier Verna Oct 13 '20 at 09:15

1 Answer

There are a number of things that have to be considered:

  • Does any reflog entry still point to one of the old revisions (from before the rewrite)?
  • Does any stash entry point to one of the old revisions?
  • Does any remote branch point to one of the old revisions?
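The three questions above can be checked from the command line. What follows is only a sketch: the helper names `list_holders` and `is_reachable` are made up for illustration, and both assume they are run from inside the repository in question.

```shell
# list_holders (hypothetical helper): print everything that can still
# keep pre-rewrite objects alive in the current repository.
list_holders() {
    # Branches, tags, remote-tracking branches, refs/stash, and any
    # refs/original/* backups left behind by git filter-branch.
    git for-each-ref --format='%(refname)'
    # Reflog entries for all refs.
    git reflog show --all
    # Stash entries.
    git stash list
}

# is_reachable PREFIX (hypothetical helper): succeed if an object whose
# ID starts with PREFIX is reachable from any ref or any reflog entry.
is_reachable() {
    git rev-list --objects --all --reflog | grep "^$1" >/dev/null
}
```

For example, `is_reachable 250129` asks the same question as the `git rev-list --objects --all | grep 250129` in the question, but also walks the reflogs. If it still fails, nothing references the blob and only the old pack is keeping it around.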

For Git to delete an object, nothing may reference it any more. Another thing to consider is that objects are stored in "packs". I remember that I once had to "explode" all my pack files (that is, get Git to write every object to the filesystem as a loose object), delete the pack files, and then ask Git to repack.
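A cleanup along those lines might look like the sketch below. It does not explode the packs by hand. Instead it uses `git repack -A -d` (the command suggested in the comments), which rebuilds the pack and turns unreachable objects into loose ones, followed by `git prune`. The function name `purge_unreachable` is made up for illustration. This is destructive, since it throws away the reflog safety net, so try it on a copy of the repository first.

```shell
# purge_unreachable (hypothetical helper): drop every object that no ref
# can reach any more. Run from inside the repository.
purge_unreachable() {
    # Expire all reflog entries, so old commits are no longer kept
    # alive by the reflogs.
    git reflog expire --expire=now --all
    # Rebuild a single pack from reachable objects; with -A and -d,
    # unreachable objects from the old packs become loose objects.
    git repack -A -d -q
    # Delete loose objects that are unreachable, regardless of age.
    git prune --expire=now
}
```

If an object survives this, some ref still points at it: a stash entry, a remote-tracking branch, or a `refs/original/*` backup left by `git filter-branch`.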

https://git-scm.com/book/en/v2/Git-Internals-Maintenance-and-Data-Recovery

Check the part about "Removing objects". Hope that's good enough.

eftshift0