Let me put this up front, as it may be the most relevant part: blobs referenced by unreferenced trees typically come from using git write-tree. Some Git scripts use this command as a quick way to abort if the index contains unmerged entries.
In general, unreferenced items are normal enough; they're eventually collected and discarded by git gc, usually as a result of a background automatic git gc --auto.
Besides ojdo's answer, consider this:
- Get all commits using git log --pretty=tformat:'%T|%h|%s|%aN|%aE'
The git log command does a revision (commit-graph) walk starting from the specified revisions, or from HEAD if no starting revision is provided. Some commits may be reachable only from some specific refs.
Even if you add --branches here, this only starts from all branches; some commits might be reachable only from some specific tag, or from a remote-tracking name. Using --all augments this to start from all refs ... but this still omits non-ref references, such as ORIG_HEAD and reflog entries.
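The difference between these starting points is easy to see by counting. This sketch builds a throwaway repository in which one commit is reachable only from a tag and another only from the reflog; the tag name, messages, and identity values are invented for the example.

```shell
#!/bin/sh
# Sketch: how many commits each choice of starting points finds.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .

git commit -q --allow-empty -m "base"
git commit -q --allow-empty -m "tagged-only"
git tag v-example                 # hypothetical lightweight tag
git reset -q --hard HEAD~1        # branch is back at "base"

git commit -q --allow-empty -m "reflog-only"
git reset -q --hard HEAD~1        # this commit now lives only in the reflog

n_head=$(git rev-list --count HEAD)               # default starting point
n_branches=$(git rev-list --count --branches)     # all branches
n_all=$(git rev-list --count --all)               # all refs, tags included
n_reflog=$(git rev-list --count --all --reflog)   # refs plus reflog entries

echo "HEAD=$n_head branches=$n_branches all=$n_all all+reflog=$n_reflog"

# The question's format, widened to start from refs and reflogs as well:
git log --all --reflog --pretty=tformat:'%T|%h|%s|%aN|%aE'
```

Even the widest form here is not the whole story: it still starts only from refs and reflog entries, not from everything that git fsck and git gc have to consider.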
Both git fsck and git gc need a fancier method by which they can find all references, including hidden ones. Getting this right is actually pretty hard, and it was broken between Git 2.5 (where git worktree add was first introduced) and Git 2.15 (where the bugs were fixed): we must not only consult all refs and reflogs, we must also look at all per-work-tree refs (including each one's HEAD) and each work-tree's index. Git 2.5 through 2.14 failed to check the per-work-tree refs and would thus incorrectly garbage collect loose objects that had expired (per the prune time) but were still in use in added work-trees.
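Here is a small sketch of the per-work-tree problem, assuming a Git recent enough to include the fix (2.15 or later). It adds a second work-tree with a detached HEAD and makes a commit there, so that the commit is referenced only by that work-tree's own HEAD. Directory names, messages, and identity values are hypothetical.

```shell
#!/bin/sh
# Sketch: a commit that only an added work-tree's HEAD keeps alive.
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

top=$(mktemp -d)
git init -q "$top/main"
cd "$top/main"
git commit -q --allow-empty -m "base"

# Second work-tree, detached at the current commit.
git worktree add --detach "$top/linked" >/dev/null 2>&1
git -C "$top/linked" commit -q --allow-empty -m "only in linked work-tree"
linked_head=$(git -C "$top/linked" rev-parse HEAD)

n_worktrees=$(git worktree list --porcelain | grep -c '^worktree ')
n_branches=$(git rev-list --count --branches)   # branches do not see the new commit
all_commits=$(git rev-list --all)               # --all also visits other work-trees' HEADs

git worktree list
echo "linked HEAD: $linked_head"
printf '%s\n' "$all_commits"
```

No branch or tag names the commit made in the linked work-tree, which is why a walk that ignores per-work-tree HEADs would treat it as garbage.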
Git's index never contains any tree object ID in the primary section (the one listed by git ls-files --stage). Only blob objects, including both regular files and symbolic links, and gitlinks appear in this section of the index. Gitlinks hold commit hash IDs from other repositories and must be ignored. However, there are extension records in the index. As far as I know these extension records don't count for liveness, so a tree extension would perhaps become invalid. This might not be the case (perhaps a T, R, E, E record does count as keeping a tree object live), but given that extension records are supposed to be ignorable, I suspect it does not. See the technical documentation file on the index for more.
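You can check the claim about the primary section directly. This sketch stages a regular file, a file inside a directory, and a symbolic link, then looks at the modes git ls-files --stage reports. The paths are invented, and it assumes a file system that supports symlinks; gitlinks (mode 160000) are left out to keep the example small.

```shell
#!/bin/sh
# Sketch: the primary index section lists blobs, never trees.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo data > file.txt
mkdir dir
echo inner > dir/inner.txt
ln -s file.txt link           # symbolic link, stored as a blob with mode 120000

git add file.txt dir link

staged=$(git ls-files --stage)
# First space-separated field of each entry is the mode.
modes=$(printf '%s\n' "$staged" | cut -d' ' -f1 | sort -u)

printf '%s\n' "$staged"
echo "distinct modes:"
printf '%s\n' "$modes"
```

The directory shows up only as the path prefix of dir/inner.txt; there is no entry with tree mode 040000.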