I've been playing with git internals, and noticed that tree objects can store commit objects, using the ls-tree/mktree format:

0160000 commit <sha1>    name

I'm wondering how the git GC handles such a situation.

The source code refers to this as a directory link, since the normally invalid mode bitmask 0160000 happens to be a combination of the directory and link bitmasks, and as a gitlink, since it is a link to another git directory.
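
To make the bitmask claim concrete, here is a small sketch of the arithmetic. It assumes the usual octal type bits that git writes in tree entries (040000 for a directory, 120000 for a symlink); the variable names are only illustrative.

```shell
# Octal file-type bits as they appear in git tree entries (assumed values)
dir_mode=0040000      # directory (tree)
symlink_mode=0120000  # symbolic link

# OR the two masks together and print the result in octal
gitlink_mode=$(printf '%o' $(( dir_mode | symlink_mode )))
echo "$gitlink_mode"  # prints 160000
```

The leading zero in 0160000 is just padding; `git ls-tree` prints the mode as 160000.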

Looking around I can see that submodule behavior is built on this (paired with a .gitmodules file to know where the other git directory is), but would git choke on it if it found one in the wild, in some other context? What if the commit was instead a local commit within the git object database? Were this the only reference to the commit would that commit not be GC'd, or does the GC not count this as a reference since it assumes it's external?

Chris Keele
  • This refers to a commit in a *submodule*, not a commit in the *current* repository. – o11c Mar 14 '16 at 02:45
  • Yeah, I thought that since trees *can* point to commits that the GC might recognize non-submodule commit references to sha1s inside the repo as references to hang on to. Seems not. – Chris Keele Mar 14 '16 at 22:07

1 Answer

It turns out the behavior is not subtle: I simulated it myself, and the commit is indeed picked up for GC, even when a tree entry is the only thing referring to it.

Demonstration:

git init test && cd test

blob=$(echo "foobar" | git hash-object -w --stdin)
# printf expands \t in every shell; plain echo only does so in some
tree=$(printf '100644 blob %s\tblob\n' "$blob" | git mktree)
commit=$(git commit-tree "$tree" -m "Made sample commit.")

git prune -n # Should show the sha1's for the above objects since no ref points to them

tree_with_commit=$(printf '0160000 commit %s\tlocal-commit\n' "$commit" | git mktree)
git tag commit-tree "$tree_with_commit" # To register our objects with a ref and avoid GC

git prune -n # Same objects are still up for GC: the tree_with_commit reference won't preserve them
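
As a cross-check on the demonstration above, the same result can be seen with `git fsck --unreachable`, which reports unreachable objects without deleting anything. This is a sketch under a few assumptions: it runs in a throwaway directory from `mktemp`, and it sets a placeholder identity (`demo`) through environment variables so that `git commit-tree` works on a machine with no git config.

```shell
# Work in a scratch repository so nothing real is touched
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Placeholder identity, only needed for commit-tree
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

blob=$(echo "foobar" | git hash-object -w --stdin)
tree=$(printf '100644 blob %s\tblob\n' "$blob" | git mktree)
commit=$(git commit-tree "$tree" -m "Made sample commit.")

# Reference the local commit only through a gitlink entry, then tag that tree
tree_with_commit=$(printf '160000 commit %s\tlocal-commit\n' "$commit" | git mktree)
git tag commit-tree "$tree_with_commit"

# List objects that no ref reaches; the commit should still be among them
unreachable=$(git fsck --unreachable --no-reflogs 2>/dev/null)
echo "$unreachable"
```

If the gitlink counted as a reference, the commit would drop out of this list once the tag exists; in my understanding it stays listed as an unreachable commit.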
Chris Keele
  • Needed to add `-e` as in `tree=$(echo -e "100644 blob $blob\tblob" | git mktree)` - see http://stackoverflow.com/a/27283328/281545 – Mr_and_Mrs_D May 07 '16 at 15:01
  • Could you explain what `tree=$(echo "100644 blob $blob\tblob" | git mktree)` does? – Mr_and_Mrs_D May 07 '16 at 15:12
  • @Mr_and_Mrs_D `git mktree` lets you create a new tree object by giving it a line for every object you want to add to the tree. The format of each line must be `"file_permissions object_type object_sha1 \t name_in_tree"`. It returns the sha1 of the created tree object. In this case I'm just taking our previously created blob, whose sha1 is stored in `$blob`, and adding it as a read/writeable blob to a new tree under the name `blob`. – Chris Keele May 09 '16 at 14:13
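
The line format described in the last comment can be tried on its own by building a one-entry tree and reading it back with `git ls-tree`. This is a minimal sketch that assumes a scratch repository created with `mktemp`; the `listing` variable is just an illustrative name.

```shell
# Scratch repository for the experiment
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Store a blob, then describe a tree entry as: mode SP type SP sha1 TAB name
blob=$(echo "foobar" | git hash-object -w --stdin)
tree=$(printf '100644 blob %s\tblob\n' "$blob" | git mktree)

# ls-tree prints entries in the same format that mktree reads
listing=$(git ls-tree "$tree")
echo "$listing"
```

Because the two commands share a format, the output of `git ls-tree` can be piped straight back into `git mktree` to recreate the same tree.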