
My question is identical to this one: "Remove large .pack file created by git".

I followed all the steps listed here: https://git-scm.com/book/en/v2/Git-Internals-Maintenance-and-Data-Recovery and also tried all the steps listed in this accepted answer. However, the size of the pack file is still large.

Before:

count: 0
size: 0
in-pack: 2259
packs: 1
size-pack: 67333
prune-packable: 0
garbage: 0
size-garbage: 0

After:

count: 0
size: 0
in-pack: 2259
packs: 1
size-pack: 67333
prune-packable: 0
garbage: 0
size-garbage: 0

I can still run git verify-pack -v .git/objects/pack/pack-xxx.idx | sort -k 3 -n | tail -3 and see the three largest objects in the pack along with their hashes. But when I run git log --oneline --branches -- <large_file_name>, no commits reference the file, probably because I rewrote the commit history. I seem to have made a mistake somewhere along the line.

My question is: how do I get rid of the large .pack file?

jhpratt
Ram

1 Answer


... when I run git log --oneline --branches -- <large_file_name>, there are no commits which reference the file, which may be because I rewrote the history of commits ...

That's fine (assuming that's your intent). What you need to do now is make sure no other external references reach commits that use the file(s).

Using --branches tells git log or git rev-list [1] to look at all branch name references, i.e., everything under refs/heads/. But there may be tag name references under refs/tags/, so you should check there too. There may even be other references, so you should check all of them. The easiest way to do that is to use --all rather than --branches, which looks at every reference.
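
To make the difference concrete, here is a minimal sketch that builds a throwaway repository in which a file is reachable only through a tag. The repository location, the branch name with-big, the tag old-release, and the file name big.bin are all hypothetical stand-ins for your own names.

```shell
#!/bin/sh
# Sketch only: scratch repo, with-big, old-release and big.bin are made-up names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "base"
main_branch=$(git symbolic-ref --short HEAD)

# Commit the large file on a side branch, tag it, then delete the branch.
git checkout -q -b with-big
echo "large payload" > big.bin
git add big.bin
git commit -q -m "add big.bin"
git tag old-release
git checkout -q "$main_branch"
git branch -q -D with-big

# --branches only walks refs/heads/, so it no longer sees the file.
via_branches=$(git log --oneline --branches -- big.bin | wc -l | tr -d ' ')
# --all walks every ref, including the tag that still reaches the commit.
via_all=$(git log --oneline --all -- big.bin | wc -l | tr -d ' ')
echo "commits found via --branches: $via_branches"
echo "commits found via --all:      $via_all"

# Listing every ref shows what is still holding on to the history.
git for-each-ref --format='%(refname)'
```

In a real repository you would run only the two git log commands and git for-each-ref, substituting the path of your large file.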

But this also misses the reflogs. Every reference has (at least potentially) a reflog. To walk the reflogs, use -g or --walk-reflogs. Note that you must do this separately. If there's a reflog entry that references the commit, you can expire it manually; or you can use the brute-force method of just expiring all reflogs wholesale (which is a little dangerous since the reflogs are your main safety net, but you are doing all this on a copy of the original repository, right? :-) ).
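
As a rough illustration, the sketch below creates a commit that no ref reaches any more and shows that the reflogs still do. It uses the --reflog option mentioned in the comments for the path-limited check and -g to display the reflog entries themselves. The scratch repository and the names with-big and big.bin are hypothetical, and the wholesale expiry at the end is the brute-force variant, so try it on a copy.

```shell
#!/bin/sh
# Sketch only: scratch repo, with-big and big.bin are made-up names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "base"
main_branch=$(git symbolic-ref --short HEAD)

# Commit the large file on a side branch, then delete that branch.
git checkout -q -b with-big
echo "large payload" > big.bin
git add big.bin
git commit -q -m "add big.bin"
git checkout -q "$main_branch"
git branch -q -D with-big

# No ref reaches the commit now ...
via_all=$(git log --oneline --all -- big.bin | wc -l | tr -d ' ')
# ... but HEAD's reflog still records it.
via_reflog=$(git log --oneline --reflog -- big.bin | wc -l | tr -d ' ')
echo "via --all: $via_all, via --reflog: $via_reflog"

# Walk the reflog entries themselves (same as: git log --walk-reflogs).
git log -g --oneline

# Brute force: expire every entry in every reflog.
git reflog expire --expire=now --all
after_expire=$(git log --oneline --reflog -- big.bin | wc -l | tr -d ' ')
echo "via --reflog after expiry: $after_expire"
```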

Note that when you use git filter-branch to "rewrite history", you're really copying all of history to a new history. As such, you can temporarily increase the repository size up to about double, depending on what you do in your filters. Removing old reflogs and removing the saved original references under the refs/original/ namespace, followed by garbage collection, should shrink things back to size.
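
The cleanup sequence can be sketched as follows, using the three commands from the git filter-branch shrinking checklist that the asker quotes in the comments. Since running filter-branch itself is out of scope here, the backup ref under refs/original/ is created by hand as a stand-in, which is an assumption about what your repository looks like. The names with-big and big.bin are hypothetical.

```shell
#!/bin/sh
# Sketch only: scratch repo, with-big and big.bin are made-up names, and the
# refs/original/ ref is created manually to imitate a filter-branch backup.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

git commit -q --allow-empty -m "base"
main_branch=$(git symbolic-ref --short HEAD)

git checkout -q -b with-big
echo "large payload" > big.bin
git add big.bin
git commit -q -m "add big.bin"
blob=$(git rev-parse HEAD:big.bin)
git update-ref refs/original/refs/heads/with-big HEAD   # stand-in backup ref
git checkout -q "$main_branch"
git branch -q -D with-big

if git cat-file -e "$blob" 2>/dev/null; then blob_before=yes; else blob_before=no; fi

# 1. Delete the saved original refs.
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
# 2. Expire all reflog entries.
git reflog expire --expire=now --all
# 3. Repack and prune everything that is now unreachable.
git gc --quiet --aggressive --prune=now

if git cat-file -e "$blob" 2>/dev/null; then blob_after=yes; else blob_after=no; fi
echo "blob present before cleanup: $blob_before"
echo "blob present after cleanup:  $blob_after"
git count-objects -v
```

The order matters: garbage collection can only discard objects once neither a ref nor a reflog entry reaches them.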

Note also that if a pack file has a corresponding .keep file, Git won't throw out the kept pack even after building a new pack that covers everything. Any .keep files were created manually and must be removed manually if and when that's appropriate.
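
A quick way to check for kept packs is to look for .keep files next to the packs. The sketch below fakes one in a throwaway repository so there is something to find. The repository and file.txt are hypothetical, and in a real repository you should only delete a .keep file once you are sure nobody needs that pack to be preserved.

```shell
#!/bin/sh
# Sketch only: the scratch repo and file.txt are made-up; the .keep file is
# created here purely to simulate a manually kept pack.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"

echo "data" > file.txt
git add file.txt
git commit -q -m "add file.txt"
git gc --quiet                      # pack the loose objects

for idx in .git/objects/pack/pack-*.idx; do
    touch "${idx%.idx}.keep"        # mark the pack as kept
done

# List any kept packs.
keep_files=$(find .git/objects/pack -name '*.keep')
echo "kept packs:"
echo "$keep_files"

# Remove the markers once they are no longer appropriate.
find .git/objects/pack -name '*.keep' -exec rm -f {} +
remaining=$(find .git/objects/pack -name '*.keep' | wc -l | tr -d ' ')
echo "remaining .keep files: $remaining"
```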


[1] These two commands, git log and git rev-list, are closely related: both are built on the same revision-walking code and accept largely the same options. They have slightly different entry points that set up different defaults: git log will start from HEAD if you don't name any other starting points, while git rev-list demands some starting points.

torek
  • There's also a plain `--reflog` option to add every reflog entry as if given on the command line, same as `--branches` adds every branch tip. – jthill Dec 28 '17 at 03:48
  • @torek and @jthill: Thanks a lot for all the information. I finally managed to shrink my repository. This checklist (https://git-scm.com/docs/git-filter-branch#_checklist_for_shrinking_a_repository) helped me figure it out, in addition to your comments. To be more specific, I ran these commands: `git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d`, `git reflog expire --expire=now --all`, and finally `git gc --aggressive --prune=now` – Ram Dec 28 '17 at 04:58
  • @jthill: huh, this flag has been in since Git 1.5.0, but wasn't documented until Git 2.2.0 (and I was not aware of it). – torek Dec 28 '17 at 05:42