
I'm trying to retrieve all labels and descriptions, in several languages, for the instances of a specific class in Wikidata.

I have followed the answer to a similar question (https://stackoverflow.com/a/49129274/3873799), which proposes the following approach:

SELECT DISTINCT *
WHERE 
{
  ?item (ps:P31) wd:Q92275707
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en".
            ?item rdfs:label ?label_EN.
            ?item schema:description ?desc_EN .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "it".
            ?item rdfs:label ?label_IT.
            ?item schema:description ?desc_IT .
  } hint:Prior hint:runLast false.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "de".
            ?item rdfs:label ?label_DE.
            ?item schema:description ?desc_DE .                
  } hint:Prior hint:runLast false.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "fr".
            ?item rdfs:label ?label_FR.
            ?item schema:description ?desc_FR .
  } hint:Prior hint:runLast false.
} LIMIT 100

Try full query here.

The problem with this query is that it returns nothing for descriptions; more importantly, I do not get the label that I would expect, but I always get a label in the form of statement/Q681554-7152e1c3-43a8-0b46-824d-62fd44d480ca; why is that?

The result that I would expect is, for example taking Q681554:

  • Label = Toron [English]; Toron [Italian]; nothing for the other languages.
  • Description = castle ruin [English]; nothing for the other languages.

Alternatively, the following approach (try query here):

SELECT DISTINCT *
{
  ?item wdt:P31 wd:Q92275707;
        rdfs:label ?label;
        schema:description ?desc 
        filter(lang(?label) = 'en' && lang(?desc) = 'en' || lang(?label) = 'it' && lang(?desc) = 'it' || lang(?label) = 'fr' && lang(?desc) = 'fr') #optional filtering
}
ORDER BY ?item

This does return the labels and descriptions in different languages, but the problem is that each item produces multiple rows, one for every language found for its label/description.

How can these queries be corrected, and is there a preferred option, performance-wise?

alelom
  • The problem with the first query is that it uses `ps:P31`, which leads to the statement node and not the direct value. Use the first query, but with the direct property instead of the statement property: `?item wdt:P31 wd:Q92275707` – UninformedUser Mar 13 '23 at 05:50
  • Your second query indeed returns multiple rows, because it binds all labels to the same variable, so the result set has only one "column" for the labels. You could do the same as in your first query, i.e. bind the label of each language to a separate variable, keep only the corresponding language test in each filter, and put each one into its own `OPTIONAL` block, like `select * {?item wdt:P31 wd:Q92275707 . OPTIONAL{?item rdfs:label ?l_fr filter(lang(?l_fr) = "fr")} OPTIONAL{?item rdfs:label ?l_en filter(lang(?l_en) = "en")} }` – UninformedUser Mar 13 '23 at 05:52
  • Thank you @UninformedUser, I thought it could be something trivial I missed on my end. I am almost tempted to delete the question, however I still think it could be useful for others. Perhaps you could add an answer with the content of your comments, maybe adding some observations on the performance of one approach vs. the other? – alelom Mar 13 '23 at 08:26
  • I also noticed that the first query, under label, returns the Q ID when a language label cannot be found. The second query returns nothing when a label is not found, which seems more appropriate. – alelom Mar 13 '23 at 08:34
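
For reference, the `OPTIONAL` approach suggested in the comments could be written out in full roughly as follows. This is an untested sketch, not a confirmed answer: it assumes the prefixes predefined by the Wikidata Query Service (`wd:`, `wdt:`, `rdfs:`, `schema:`), reuses the variable names from the first query, and makes no claim about performance compared with the label-service version.

```sparql
# Sketch: one row per item, one column per language.
# Each label/description sits in its own OPTIONAL block, so a missing
# language leaves that column unbound instead of dropping the item.
SELECT ?item ?label_EN ?desc_EN ?label_IT ?desc_IT ?label_DE ?desc_DE ?label_FR ?desc_FR
WHERE
{
  # wdt:P31 is the direct property; ps:P31 would bind ?item to statement nodes.
  ?item wdt:P31 wd:Q92275707 .

  OPTIONAL { ?item rdfs:label ?label_EN . FILTER(LANG(?label_EN) = "en") }
  OPTIONAL { ?item schema:description ?desc_EN . FILTER(LANG(?desc_EN) = "en") }

  OPTIONAL { ?item rdfs:label ?label_IT . FILTER(LANG(?label_IT) = "it") }
  OPTIONAL { ?item schema:description ?desc_IT . FILTER(LANG(?desc_IT) = "it") }

  OPTIONAL { ?item rdfs:label ?label_DE . FILTER(LANG(?label_DE) = "de") }
  OPTIONAL { ?item schema:description ?desc_DE . FILTER(LANG(?desc_DE) = "de") }

  OPTIONAL { ?item rdfs:label ?label_FR . FILTER(LANG(?label_FR) = "fr") }
  OPTIONAL { ?item schema:description ?desc_FR . FILTER(LANG(?desc_FR) = "fr") }
}
ORDER BY ?item
LIMIT 100
```

Because every language gets its own variable, an item such as Q681554 should come back as a single row, with the columns for languages that have no label or description left empty rather than filled with the Q ID.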

0 Answers