I am using the Wikidata Query Service to fetch data: https://query.wikidata.org/

I have already managed to retrieve an entity's label using 2 methods:

  1. Using the wikibase label service. For example:
SELECT ?spouse ?spouseLabel WHERE {
   wd:Q1744 wdt:P26 ?spouse.
   SERVICE wikibase:label {
     bd:serviceParam wikibase:language "en" .
   }
}
  2. Using the rdfs:label property:
SELECT ?spouse ?spouseLabel WHERE {
   wd:Q1744 wdt:P26 ?spouse.
   ?spouse rdfs:label ?spouseLabel. filter(lang(?spouseLabel) = "en").
}

However, for complex queries the second method seems to perform faster, contrary to what the MediaWiki user manual states:

The service is very helpful when you want to retrieve labels, as it reduces the complexity of SPARQL queries that you would otherwise need to achieve the same effect.

(https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual#Label_service)

What does the wikibase label service add that I can't achieve using just rdfs:label? It seems odd, since both seemingly serve the same purpose, yet the rdfs:label method seems faster (which is logical, because the query does not need to join data from external sources).

Thanks!

logi-kal
Shlomi Uziel
  • How do you get a value for ?spouseLabel in the first example when it doesn't appear in the query? – Joshua Taylor Sep 01 '16 at 18:57
  • @JoshuaTaylor Wikidata has this weird non-standard SPARQL extension where they use the SERVICE clause to provide labels for resources. – Jeen Broekstra Sep 01 '16 at 20:50
  • @jeen I gathered that from the answer, but it seems like there's still some magic in knowing what it will be called. (Not that I looked at the docs for it though) – Joshua Taylor Sep 01 '16 at 22:04
  • It isn't magic, the document states the use case: "WDQS will automatically generate labels as follows: If an unbound variable in SELECT is named ?NAMELabel, then WDQS produces the label (rdfs:label) for the entity in variable ?NAME." – Shlomi Uziel Sep 02 '16 at 14:08
  • @ShlomiUziel Well, at least it's documented well there. It seems kind of horrible that you end up with a query where someone unfamiliar with that particular system but familiar with SPARQL would (i) not have any clue why it should produce the results that it does; and (ii) see code that looks like it should be removable because it doesn't contribute to the results, but actually does. I don't know if there are SPARQL linting tools, but I'd expect one to flag this as "doesn't contribute to projected bindings". Hooray. – Joshua Taylor Sep 02 '16 at 20:02
  • @JoshuaTaylor I surely agree, but unfortunately I'm bound to that implementation since I need to query Wikidata's data, and I currently lack the resources to manage an instance of their database myself. – Shlomi Uziel Sep 03 '16 at 08:43

1 Answer

As I understand from the documentation, the wikibase label service simplifies the query by removing the need to explicitly search for labels. In that regard it reduces complexity of the query you need to write, in terms of syntax.

I would assume that the query is then expanded to another representation, maybe with the rdfs namespace like in your second option, before actually being resolved.

As for the second option being faster, have you done any systematic benchmarking? In a couple of tries of mine, the first option was faster. I would assume that the performance of the public endpoint is subject to fluctuation anyway, depending on demand, caching, etc., so it may be tricky to draw conclusions about performance for similar queries.
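A more systematic comparison could look roughly like the sketch below: it runs both queries from the question several times, alternating between them, and reports the median wall-clock time of each. This is only an illustration under assumptions, not a definitive benchmark: the endpoint URL `https://query.wikidata.org/sparql`, the `format=json` parameter, the User-Agent string, and the names `run_query`, `benchmark` and `main` are not from the question and are chosen here for the example.

```python
import json
import statistics
import time
import urllib.parse
import urllib.request

# Assumption: the public SPARQL endpoint behind https://query.wikidata.org/
ENDPOINT = "https://query.wikidata.org/sparql"

# Method 1 from the question: wikibase label service.
LABEL_SERVICE_QUERY = """
SELECT ?spouse ?spouseLabel WHERE {
  wd:Q1744 wdt:P26 ?spouse.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

# Method 2 from the question: explicit rdfs:label with a language filter.
RDFS_LABEL_QUERY = """
SELECT ?spouse ?spouseLabel WHERE {
  wd:Q1744 wdt:P26 ?spouse.
  ?spouse rdfs:label ?spouseLabel.
  FILTER(LANG(?spouseLabel) = "en")
}
"""


def run_query(query):
    """Send one query to the endpoint and return the parsed JSON response."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        # Hypothetical identifier; replace with your own contact details.
        headers={"User-Agent": "label-benchmark-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


def benchmark(queries, runner=run_query, rounds=5, pause=1.0):
    """Return the median duration in seconds for each named query.

    Queries are interleaved within each round so that load fluctuations on
    the endpoint affect all of them in a similar way.
    """
    timings = {name: [] for name in queries}
    for _ in range(rounds):
        for name, query in queries.items():
            start = time.perf_counter()
            runner(query)
            timings[name].append(time.perf_counter() - start)
            time.sleep(pause)  # be polite to the shared public endpoint
    return {name: statistics.median(values) for name, values in timings.items()}


def main():
    """Benchmark both label methods and print the median times."""
    medians = benchmark(
        {"label service": LABEL_SERVICE_QUERY, "rdfs:label": RDFS_LABEL_QUERY}
    )
    for name, seconds in medians.items():
        print(f"{name}: median {seconds * 1000:.0f} ms")
```

Calling `main()` runs the comparison against the live endpoint. Because repeated identical queries may be answered from a cache, the numbers are best read as a rough indication; substituting the actual complex query, and varying it slightly between rounds, would give a more meaningful picture.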

atineoSE
  • I tend to agree. I also suspected that the term "complexity" isn't referring to runtime efficiency, as I initially thought. Also, the official documentation does hint that the query is indeed expanded, or somehow optimized or processed, to generate the desired result (as I commented on the question). I agree that thorough benchmarking is required to answer decisively, although systematic runs of certain queries gave me some pretty unambiguous results: about 900 ms with rdfs:label vs. 8000 ms with the wikibase label service. Maybe the public endpoint does some optimizations or changes we are not aware of... – Shlomi Uziel Sep 02 '16 at 14:14