14

While trying to understand "JGit: Retrieve tag associated with a git commit", I arrived at this thread on the JGit mailing list: "[jgit-dev] Commits and tags".

In this thread they refer to the peel method of org.eclipse.jgit.lib.Repository:

Peel a possibly unpeeled reference to an annotated tag.

I could only find two mentions of peeling in the Git documentation: git-check-ref-format(1) Manual Page and Git Internals - Maintenance and Data Recovery.

What does the term peel mean in Git? How is it useful? What does it have to do with onions?

Ed I

2 Answers

19

You already know that each object in a repository has a unique SHA-1, and also an associated type and contents. (From the command line, use git cat-file -t sha1 to see the type of the given object, and git cat-file -p sha1 to see the contents, possibly with some pretty-printing applied but generally fairly raw in format.)
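
As a minimal sketch of those two commands, the script below builds a throwaway repository and inspects its only commit. The temporary directory, the "Example" identity, and the variable names are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: look at the type and contents of one object in a scratch repository.
set -e

# Hypothetical identity, so the script does not depend on global git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git commit -q --allow-empty -m 'initial commit'

commit_id=$(git rev-parse HEAD)            # full object ID of the commit
obj_type=$(git cat-file -t "$commit_id")   # the object's type
echo "type: $obj_type"
git cat-file -p "$commit_id"               # the object's contents, pretty-printed
```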

An annotated tag object in a git repository contains an SHA-1. (I hope this part is clear and uncontroversial. :-) )

Typically, the SHA-1 inside an annotated tag object—let's call this the "target-ID", and the object thus named the "target" of the tag—is the ID of a commit object. In this case, all is still pretty clear and simple. But what if it's not?

In particular, the target of an annotated tag might be another annotated tag. If this is the case, you must fetch that second annotated tag, which contains yet another target-ID. You must repeat this process, peeling off tag after tag, until you arrive at a non-tag object. Ideally this will be a commit, but of course it could be any of the remaining three kinds of objects (it's not clear what it means to tag a tree or blob though; only a commit makes sense).

This "peeling" process has been compared to peeling an onion, and that's where the phrase originates.
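
The peeling loop described above can be sketched by hand with git cat-file. This is only an illustration: the helper name peel_manually, the scratch repository, and the "Example" identity are assumptions, and in practice git rev-parse 'name^{}' does the same job for you.

```shell
#!/bin/sh
# Sketch: peel a chain of annotated tags one layer at a time.
set -e

# Hypothetical identity, so the script does not depend on global git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git commit -q --allow-empty -m 'initial commit'
git tag -a anno1 -m 'annotated tag'
git tag -a anno2 anno1 -m 'another tag'   # anno2's target is the anno1 tag object

# peel_manually (hypothetical helper): follow "object" headers until the
# current object is no longer a tag, then print its ID.
peel_manually() {
    id=$(git rev-parse --verify "$1")
    while [ "$(git cat-file -t "$id")" = tag ]; do
        # The first line of a tag object is "object <target-ID>".
        id=$(git cat-file -p "$id" | sed -n '1s/^object //p')
    done
    echo "$id"
}

manual=$(peel_manually anno2)
direct=$(git rev-parse 'anno2^{}')   # git's own recursive peel
commit=$(git rev-parse HEAD)
echo "manual: $manual"
echo "direct: $direct"
```

Both IDs should name the commit that anno1 was created on, since that is the first non-tag object in the chain.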

Note that if you're writing a repository health checker, it might be wise to make sure that the chain of tags does not loop (e.g., what if tag 1234567 has 7654321 as its target, and 7654321 is an annotated tag with 1234567 as its target?). (Edit: as remram points out in a comment, this requires "breaking" SHA-1. That means it's almost impossible for it to happen in practice, just as you won't get a tree that recursively points to itself.)
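
For a checker that wants to be defensive about such loops, one possible approach is to remember every tag ID already visited while peeling. The sketch below assumes a scratch repository and a hypothetical helper named peel_checked; it can only be exercised here on a well-formed chain, since building a real loop would need an SHA-1 collision.

```shell
#!/bin/sh
# Sketch: peel a tag chain, refusing to visit the same tag object twice.
set -e

# Hypothetical identity, so the script does not depend on global git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git commit -q --allow-empty -m 'initial commit'
git tag -a anno1 -m 'annotated tag'
git tag -a anno2 anno1 -m 'another tag'

# peel_checked (hypothetical helper): like a plain peel loop, but keeps a
# space-separated list of visited tag IDs and fails if one repeats.
peel_checked() {
    id=$(git rev-parse --verify "$1")
    seen=""
    while [ "$(git cat-file -t "$id")" = tag ]; do
        case " $seen " in
            *" $id "*)
                echo "tag loop detected at $id" >&2
                return 1
                ;;
        esac
        seen="$seen $id"
        id=$(git cat-file -p "$id" | sed -n '1s/^object //p')
    done
    echo "$id"
}

checked=$(peel_checked anno2)
echo "peeled: $checked"
```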

Edit: how to make a tag that points to another tag:

... make a repo with a commit that can be tagged ...
$ git tag -a anno1 -m 'annotated tag'
$ git tag -a anno2 anno1 -m 'another tag'
$ git cat-file -p anno1
object d4ec8b2d465de0c087d645f52ba5040586b8ce0f
type commit
tag anno1
tagger Chris Torek <chris.torek@gmail.com> 1413933523 -0600

annotated tag
$ git cat-file -p anno2
object cd1e0637c348e46c645819ef6de36679052b4b7f
type tag
tag anno2
tagger Chris Torek <chris.torek@gmail.com> 1413934239 -0600

another tag
torek
  • 2
    This is wrong. Annotated tag objects can only point to commit objects, not other tags. And chains of tags cannot loop for the same reasons that chains of commits don't loop -- the SHA1 function is not invertible. – remram Oct 21 '14 at 19:04
  • 2
    @remram: the point about SHA-1 being non-invertible is valid, except that there are claims of having broken it (I'm not sure whether to believe them), or at least of having the ability to break it. As for tags not pointing to tags, it's true that `git tag -a` doesn't do it by default, but I was able to make a tag that points to another tag; I'll edit the post. It works well enough: `git show anno3` shows the tag pointing to tag `anno1` and then the underlying commit on `anno1` (`anno2` is a demo that `git tag -a` goes for the underlying commit). – torek Oct 21 '14 at 23:24
  • 1
    (Update: it's easier than I expected, I re-did it as `anno1` with `anno2` pointing to `anno1`.) – torek Oct 21 '14 at 23:44
  • @remram "Annotated tag objects can only point to commit objects, not other tags." is false, see https://git-scm.com/book/en/v2/Git-Internals-Git-References – Ulugbek Abdullaev Sep 24 '20 at 10:53
  • Yeah that seems to be the case and has led me to very interesting situations. I still don't see how you could form a loop though, since references are hashes. – remram Sep 24 '20 at 15:35
  • Wow, old answer resurrected :-) The reason I'd still recommend that a consistency-checker actually check for loops (but wouldn't care if normal Git operations just assumed no loops) is just that sort of break-the-SHA-1 issue that we see now in [How does the newly found SHA-1 collision affect Git?](https://stackoverflow.com/q/42433126/1256452) – torek Sep 24 '20 at 16:27
5

"Peeling" refers to dereferencing, e.g. going from a ref that names an annotated tag object to the commit that the tag points to.

The term is also used for the other cases of the ^{xxx} syntax, e.g. to go from commit to tree with abc1234^{tree}.
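
A small sketch of the ^{xxx} forms, assuming a scratch repository with one commit and one annotated tag (the tag name anno1, the "Example" identity, and the variable names are made up for illustration):

```shell
#!/bin/sh
# Sketch: dereference with the ^{type} syntax.
set -e

# Hypothetical identity, so the script does not depend on global git config.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git commit -q --allow-empty -m 'initial commit'
git tag -a anno1 -m 'annotated tag'

commit=$(git rev-parse HEAD)
tag_obj=$(git rev-parse anno1)                 # the tag object itself
from_tag=$(git rev-parse 'anno1^{commit}')     # tag -> commit
tree=$(git rev-parse 'HEAD^{tree}')            # commit -> tree
tree_from_tag=$(git rev-parse 'anno1^{tree}')  # tag -> commit -> tree

echo "tag object:    $tag_obj ($(git cat-file -t "$tag_obj"))"
echo "peeled commit: $from_tag"
echo "tree:          $tree"
```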

remram