Following the GitLab documentation (reducing_the_repo_size_using_git), I'm cleaning up a repo hosted on GitLab, so I exported it, got a link by email, then downloaded and untarred the archive.
aaa_export$ git clone --bare --mirror project.bundle
Cloning into bare repository 'project.git'...
Receiving objects: 100% (109830/109830), 627.15 MiB | 63.75 MiB/s, done.
Resolving deltas: 100% (89023/89023), done.
aaa_export$
$ du -sh project.git
633M
$
Then I clean up unnecessary files and optimize the local repository:
$ git gc --prune=now --aggressive
Enumerating objects: 109830, done.
Counting objects: 100% (109830/109830), done.
Delta compression using up to 4 threads
Compressing objects: 100% (108121/108121), done.
Writing objects: 100% (109830/109830), done.
Selecting bitmap commits: 13458, done.
Building bitmaps: 100% (238/238), done.
Total 109830 (delta 89020), reused 19482 (delta 0)
$
$ du -sh project.git
633M # Not a surprise: this clone came from a GitLab export
$
Out of curiosity, I look for the biggest blobs in my repo:
$ git verify-pack -v objects/pack/*idx |sort -n -k3 |tail -3
24c41d1b2132daac9a13910f839173da3890c991 blob 13464592 8520894 149667646
28678d4814faecf8c20a3c893e1ac93cd159a289 blob 19558229 19538291 167335758
8103683624212caadee8e609295addd24ec43db1 blob 21805631 15702989 237885293
$
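As a complement to the verify-pack listing above, here is a minimal sketch that lists the largest blobs reachable from any ref, together with a path. The function name largest_reachable_blobs and its arguments are hypothetical, introduced only for illustration; the git commands themselves (rev-list --objects --all and cat-file --batch-check) are standard. A blob that shows up in verify-pack but not in this listing would be one that no ref reaches.

```shell
# Hypothetical helper (not part of git): print the N largest blobs that are
# reachable from any ref, as "blob <sha> <uncompressed-size> <path>".
# $1 = path to the repository (bare or not), $2 = how many blobs (default 3).
largest_reachable_blobs() {
    repo=$1
    count=${2:-3}
    # rev-list prints "<sha> <path>" for every reachable object;
    # %(rest) carries the path through cat-file --batch-check.
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file \
            --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
        awk '$1 == "blob"' |
        sort -n -k3 |
        tail -n "$count"
}
```

Assuming the bare clone from above, it could be called as largest_reachable_blobs project.git 3. The size shown is the uncompressed object size, which should correspond to the third column of the verify-pack output.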
So I try to list the commits that reference the biggest blob:
$ git cat-file -t 810368362
blob
$
$ git rev-list --objects --all | grep 8103683624
$
I get the same empty result with git whatchanged, as suggested in "Which commit has this blob?":
$ git whatchanged --all --find-object=8103683624
$
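Another way to cross-check the empty results above is to ask git fsck which objects it considers unreachable. The sketch below is only an illustration under assumptions: the function name unreachable_objects is hypothetical, and it assumes the fsck output lines have the form "unreachable <type> <sha>".

```shell
# Hypothetical helper (not part of git): print "<type> <sha>" for every
# object that git fsck reports as unreachable from refs (reflogs ignored).
# $1 = path to the repository (bare or not).
unreachable_objects() {
    # Progress and warnings go to stderr; keep only the "unreachable" lines.
    git -C "$1" fsck --unreachable --no-reflogs 2>/dev/null |
        awk '$1 == "unreachable" { print $2, $3 }'
}
```

Assuming the bare clone from above, unreachable_objects project.git | grep 8103683624 would show whether fsck also regards the blob as unreachable.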
So I'm very surprised to find this big blob that corresponds to nothing:
SHA-1 type size size-in-packfile offset-in-packfile
8103683624212caadee8e609295addd24ec43db1 blob 21805631 15702989 237885293
How can I find out what this blob is for?
Note: this is a bare repository, so as far as I understand git log, git diff and git describe do not apply. That is why VonC's answer to "Which commit has this blob?", which uses git log --find-object=<object-id>, does not apply here. Also, that question is about creating a central Git repository, not about understanding how a big blob can be referenced by no commit or tree.