2

jq -r '."@graph"[]["rdfs:label"]' 9.0/schemaorg-all-http.jsonld works, but jq -r '."@graph"[].["rdfs:label"]' 9.0/schemaorg-all-http.jsonld does not, and I don't understand why ["rdfs:label"] does not need a leading dot here. https://stackoverflow.com/a/39798796/308851 suggests it needs .name after [], and https://stedolan.github.io/jq/manual/#Basicfilters says:

For example .["foo::bar"] and .["foo.bar"] work while .foo::bar does not,

Where did the dot go?

chx

2 Answers

2

The dot serves two different purposes in jq:

  • A dot on its own means "the current object". Let's call this the identity dot. It can only appear at the start of an expression or subexpression: at the very start of the program, after a binary operator such as |, + or and, or directly after an opening parenthesis (.
  • A dot followed by a string or an identifier means "retrieve the named field". Let's call this an indexing dot. If there is anything to its left, it must be a complete subexpression, for example a literal value, a parenthesised expression, a function call, etc., and the field is retrieved from that expression's result. With nothing to its left, the field is retrieved from the current object.

The thing to understand is that in the square bracket operators, the dot shown in the documentation is an identity dot - it's not actually part of the operator itself. The operator is just the square brackets and their contents, and it needs to be attached to another complete expression.

In general, both square bracket operators (e.g. ["foo"] or [] or [0] or [2:5]) and object identifier indexing operators (e.g. .foo or ."foo") can be appended to another expression. Only the object identifier indexing operators can appear "bare" with no expression on the left. Since the square bracket operators can't appear bare, you will typically see them in the documentation composed after an identity dot.
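A quick way to see why the square brackets can't stand alone: with nothing to their left, jq parses them as array construction rather than indexing. This is only a sketch on a made-up one-key input, not taken from the question's file:

```shell
# Made-up sample input, used only for illustration.
input='{"foo": 1}'

# Bare brackets: jq builds a new array containing the string "foo".
bare=$(printf '%s' "$input" | jq -c '["foo"]')
echo "$bare"      # ["foo"]

# Brackets attached to the identity dot: jq indexes into the input.
indexed=$(printf '%s' "$input" | jq -c '.["foo"]')
echo "$indexed"   # 1
```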

These are all equivalent:

.foo            # indexing dot
."foo"          # indexing dot
. .foo          # identity dot and indexing dot
. | .foo        # identity dot and indexing dot
.["foo"]        # identity dot
. | .["foo"]    # two identity dots
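If you want to check the equivalence yourself, something like the following loop should do; the input object is made up, and each filter is expected to print the same value:

```shell
# Made-up sample input, used only for illustration.
input='{"foo": 42}'

results=""
for filter in '.foo' '."foo"' '. .foo' '. | .foo' '.["foo"]' '. | .["foo"]'; do
  # Run each spelling of the filter against the same input.
  out=$(printf '%s' "$input" | jq -c "$filter")
  echo "$filter => $out"
  results="$results$out "
done
```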

So the answer to your question is that the last dot in ."@graph"[].["rdfs:label"] isn't allowed because:

  • It can't be an identity dot because it has an expression on the left.
  • It can't be an indexing dot because it doesn't have an identifier or a string on the right, it has a square bracket.
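Here is a small reproduction, assuming a made-up document with the same shape as the schema.org file. Whether the second command fails depends on the jq version installed, so the sketch only reports the outcome rather than assuming it:

```shell
# Tiny stand-in for the schema.org file (made-up data, same shape).
doc='{"@graph": [{"rdfs:label": "Thing"}, {"rdfs:label": "Person"}]}'

# Brackets attached directly to the preceding expression: accepted.
labels=$(printf '%s' "$doc" | jq -r '."@graph"[]["rdfs:label"]')
echo "$labels"

# Extra dot before the bracket: a syntax error on the jq versions described
# above; newer releases may accept it, so the failure is not assumed here.
printf '%s' "$doc" | jq -r '."@graph"[].["rdfs:label"]' \
  || echo "rejected by this jq version"
```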

All that said, it looks like newer versions of jq are going to extend the syntax to allow a square bracket operator immediately after a dot that follows another expression, with the intuitive meaning of applying that indexing operation exactly as if the dot were not there. So hopefully you won't need to worry about the difference in the future.

Weeble
  • I had left a comment saying your terminology is confusing because it's inconsistent with the docs, but it's the docs that are wrong. They identify `.[...]` as an index just like `.id`, but it's really just `[...]`. [jqplay](https://jqplay.org/s/BwyN_lqXce) – ikegami Oct 12 '21 at 17:22
  • I know, it's a bit confusing. I understand _why_ the docs are written like that. After all, `.["foo"]` is a complete filter, which demonstrates indexing. And it's necessary to distinguish it from `["foo"]` which is a complete filter that constructs an array. But explaining all these concepts at once is hard. Happy to revise the answer if you have suggestions. – Weeble Oct 12 '21 at 22:07
  • yeah, but `[]` (with an explicit mention of `.[]` for clarity) could be used instead of `.[]` – ikegami Oct 12 '21 at 22:31
2

Using the terminology of the jq manual, jq expressions are fundamentally composed of pipes and what it calls "basic filters". The first filter under the heading "Basic Filters" is the identity filter, .; and .[] is the "Array/Object Value Iterator".

From this perspective, that is, from the perspective of pipes-and-basic-filters, the expression under consideration ."@graph"[]["rdfs:label"] can be viewed as an abbreviated form of the pipeline:

.["@graph"] | .[] | .["rdfs:label"]

So from this perspective, the question is what abbreviations are allowed. One of the most important abbreviation rules is:

E | .[] #=> E[]

Another is:

.["<string>"] #=> ."<string>"

Applying these rules, together with the analogous abbreviation E | .["<string>"] #=> E["<string>"], yields the simplified expression.
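As a rough check, the pipeline can be abbreviated one step at a time and each stage run against the same input. The document below is made up (it only mimics the shape of the schema.org file), and each filter is wrapped in [ ... ] so that all results land in a single array for comparison:

```shell
# Made-up stand-in for the schema.org file.
doc='{"@graph": [{"rdfs:label": "Thing"}, {"rdfs:label": "Person"}]}'

expected='["Thing","Person"]'
all_same=yes
for filter in \
  '.["@graph"] | .[] | .["rdfs:label"]' \
  '.["@graph"][] | .["rdfs:label"]' \
  '."@graph"[] | .["rdfs:label"]' \
  '."@graph"[]["rdfs:label"]'
do
  # Collect every output of the filter into one array.
  out=$(printf '%s' "$doc" | jq -c "[ $filter ]")
  echo "$filter => $out"
  [ "$out" = "$expected" ] || all_same=no
done
echo "all forms agree: $all_same"
```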

So perhaps the basic answer to the "why" in this question is: for convenience. :-)

peak