I have a git repository that, when just checked out, takes around 2.3 GiB even in the shallowest configuration, of which 1.9 GiB is inside `.git/objects/pack`. The working tree files are just about .5 GiB.
Considering I have a remote from which I can re-fetch all the objects if needed, the question is:
- What can I delete from inside `.git` (and how), such that everything deleted can then be safely re-fetched from the remote with simple git commands?
Testing a bit, I found out that if I delete everything under `.git/objects/pack/`, it will be re-downloaded from the remote with a simple `git fetch`.
There are some complaints like:
error: refs/heads/master does not point to a valid object!
error: refs/remotes/origin/master does not point to a valid object!
error: refs/remotes/origin/HEAD does not point to a valid object!
But then `.git/objects/pack` gets repopulated and further calls to `git fetch` don't complain anymore.
Is it safe to nuke `.git/objects/pack*` like this?
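One way to probe whether a repository is back in working order after such a delete-and-fetch cycle is to ask git whether `HEAD` resolves to an object that is really present and whether everything reachable from the refs exists. The sketch below does only that. `verify_objects` is a hypothetical helper name, and it assumes a git version that supports `fsck --connectivity-only`. It is a sanity check, not proof that the procedure is safe in general.

```shell
# verify_objects DIR: return 0 if the object store in DIR looks intact,
# non-zero otherwise. (Hypothetical helper, not part of git.)
verify_objects() {
    dir=$1
    # HEAD must resolve to a commit object that is present in the object store.
    git -C "$dir" rev-parse --quiet --verify 'HEAD^{commit}' >/dev/null 2>&1 || return 1
    # Objects reachable from the refs must exist; fsck exits non-zero
    # when it finds missing objects or refs pointing at invalid objects.
    git -C "$dir" fsck --connectivity-only >/dev/null 2>&1 || return 1
    return 0
}
```

Running `verify_objects path/to/checkout && echo ok` after the `git fetch` shows whether the errors quoted above were only transient.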
Assumptions:
- There are no local-only commits in the repo or any form of git manipulation (like adding/removing objects from the stage), just checking out a specific branch in shallow mode.
- The remote won't be rewriting history for the checked out branches.
- I have no control whatsoever over the contents of the remote repository itself. It's a dependency of my project, but a fast-changing one that is only available as git, and I want instructions for automated use in a continuous integration setting. Tips on how to modify the repository itself to make it take less space aren't going to help.
- As I mentioned earlier, 1.9 GiB is for a shallow clone of the one branch I'm interested in. It's a lot bigger than that when it's non-shallow, due to its long history (it is an open-source project that is over 10 years old).
- There are other repositories checked out in the same continuous-integration pipeline and I'd like to apply the same reduction of redundant-with-remote info in all of them.
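Since the same reduction is meant to run unattended over several repositories, the first assumption could be checked per repository before deleting anything. The following is a minimal sketch, not a complete guarantee: `check_refetchable` is a hypothetical helper name, and it only looks at `HEAD`, local branches, and the index and tracked files. Stashes, local tags and untracked files are not considered.

```shell
# check_refetchable DIR: succeed only if DIR has a remote, no staged or
# modified tracked files, and no commits that exist only locally.
# (Hypothetical helper, not part of git.)
check_refetchable() {
    dir=$1

    # Without a configured remote nothing could be fetched again.
    if [ -z "$(git -C "$dir" remote)" ]; then
        echo "no remote configured" >&2
        return 1
    fi

    # Staged or modified tracked files would mean local-only content.
    if [ -n "$(git -C "$dir" status --porcelain --untracked-files=no)" ]; then
        echo "index or tracked files have local changes" >&2
        return 1
    fi

    # Count commits reachable from HEAD or any local branch that no
    # remote-tracking ref reaches.
    local_only=$(git -C "$dir" rev-list --count HEAD --branches --not --remotes) || return 1
    if [ "$local_only" -ne 0 ]; then
        echo "$local_only local-only commit(s) found" >&2
        return 1
    fi
    return 0
}
```

In a pipeline this could gate the cleanup step, for example `check_refetchable "$repo" || exit 1`, where `$repo` stands for whichever checkout is being processed.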
The intent is to reduce as much as possible the amount of space taken by artifacts from a continuous-integration pipeline, while retaining enough information so that those artifacts can be downloaded and restored to working order on a developer workstation with as few (and as ordinary) commands as possible.