22

How can I query Wikidata to get all items whose labels contain a given word? I tried the following, but it didn't work; it returned nothing.

SELECT ?item ?itemLabel WHERE {
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en".
    ?item rdfs:label ?itemLabel.  
  }
FILTER(CONTAINS(LCASE(?itemLabel), "keyword"))
}
LIMIT 1000
fattah.safa
  • What is `wikibase:label`? Without prefixes it's hard to say what's going wrong. – UninformedUser Jul 22 '16 at 18:06
  • PREFIX wikibase: – fattah.safa Jul 22 '16 at 21:08
  • And where is the `wikibase:language` information in this dataset? Without it, the join is obviously empty in the SERVICE part, which is executed as a single SPARQL query. I think it could work if you put the first triple outside of the SERVICE clause. – UninformedUser Jul 23 '16 at 00:53
  • Thanks AKSW for your answer. I tried it but got "QUERY TIMEOUT ERROR: SPARQL-QUERY: queryStr=SELECT ?item ?itemLabel WHERE { SERVICE wikibase:label { bd:serviceParam wikibase:language "en". } ?item rdfs:label ?itemLabel. FILTER(CONTAINS(LCASE(?itemLabel), "palestine")) }" – fattah.safa Jul 23 '16 at 07:15
  • I meant it the other way around. I thought you want to use the labels from the ontology graph `http://wikiba.se/ontology-1.0.owl#` which you specified with the SERVICE clause. And this one does not contain the property `wikibase:language`, therefore you should put this one outside of the SERVICE clause, not the other one. But to be honest, it's not clear what you want to get with your query. Especially, the graph that you specify with `SERVICE wikibase:label` should be what? You use a prefixed URI for the property, but maybe you want just the ontology. – UninformedUser Jul 23 '16 at 11:55
  • Where do you run your query? And why can't you simply do this on the Wikidata SPARQL endpoint without using the SERVICE clause? – UninformedUser Jul 23 '16 at 11:56
  • I just updated the query to: "SELECT ?item ?itemLabel WHERE { ?item rdfs:label ?itemLabel. FILTER(CONTAINS(LCASE(?itemLabel), "keyword"@en)). } limit 3". It works in most cases, but sometimes returns "Query deadline is expired". – fattah.safa Jul 29 '16 at 22:23

4 Answers

15

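Yes, you can search by label. For example, the query below finds items that have the exact English string "Something" as a value and that have an English Wikipedia article. The predicate is left as the variable ?label, so any property with that value matches, not only rdfs:label: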

SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {
  ?item ?label "Something"@en .
  ?article schema:about ?item .
  ?article schema:inLanguage "en" .
  ?article schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

see it on Query page.
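
If the search string comes from user input, it helps to build the query programmatically and escape the literal. The following is only a minimal Python sketch under some assumptions: the helper names `escape_sparql_literal` and `build_exact_label_query` are made up for illustration, the predicate is narrowed to `rdfs:label` (so aliases no longer match), the Wikipedia-article restriction is dropped, and prefixes are omitted as in the queries in this thread.

```python
def escape_sparql_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a SPARQL string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def build_exact_label_query(label: str, lang: str = "en", limit: int = 100) -> str:
    """Return a query for items whose rdfs:label equals `label` exactly.

    Assumption: unlike the query above, this matches rdfs:label only.
    """
    literal = f'"{escape_sparql_literal(label)}"@{lang}'
    return (
        "SELECT DISTINCT ?item ?itemLabel ?itemDescription WHERE {\n"
        f"  ?item rdfs:label {literal} .\n"
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{lang}". }}\n'
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_exact_label_query("Something"))
```

An exact match like this can use the store's index, which is why it tends to return quickly; the trade-off is that it does not find labels that merely contain the word.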

Alexan
12

Following your question and the useful comments provided, I ended up with this query:

SELECT ?item ?itemLabel
WHERE { 
  ?item rdfs:label ?itemLabel. 
  FILTER(CONTAINS(LCASE(?itemLabel), "city"@en)). 
} limit 10

It returned these results:

item          itemLabel
wd:Q515       city
wd:Q7930989   city
wd:Q15253706  city
wd:Q532039    The Eternal City
wd:Q1969820   The Eternal City
wd:Q3986838   The Eternal City
wd:Q7732543   The Eternal City
wd:Q7737016   The Golden City
wd:Q5119      capital city
wd:Q1555      Guatemala City

try it here
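
To make clear what the filter is doing, here is a small Python sketch that mimics it over an in-memory list. It is an illustration only, not how the query service is implemented: `contains_filter` and `ROWS` are hypothetical names, the first three rows are taken from the results above, and the last row is invented. Two details are worth noting. Only the label is lowercased, so the keyword itself must already be lowercase. And because the keyword carries the `@en` tag, SPARQL's argument-compatibility rules mean that labels with a different language tag produce an error and are dropped.

```python
# (item, label, language tag) rows. The first three come from the results
# above; the last one is a made-up row to show the effect of the language tag.
ROWS = [
    ("wd:Q515", "city", "en"),
    ("wd:Q5119", "capital city", "en"),
    ("wd:Q1555", "Guatemala City", "en"),
    ("wd:Q0", "Mexico City", "de"),  # hypothetical, not a real item
]


def contains_filter(rows, keyword, lang="en"):
    """Mimic FILTER(CONTAINS(LCASE(?itemLabel), "keyword"@lang))."""
    matches = []
    for item, label, label_lang in rows:  # every label has to be visited
        if label_lang != lang:
            continue  # incompatible language tag: the row is filtered out
        if keyword in label.lower():  # only the label is lowercased
            matches.append((item, label))
    return matches


if __name__ == "__main__":
    print(contains_filter(ROWS, "city"))
    print(contains_filter(ROWS, "City"))  # no matches: keyword is not lowercase
```

The loop is the important part: a substring test cannot use an index, so each label is checked in turn. That is probably why a common word fills a small LIMIT quickly while a rare word can run into the deadline.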

innovimax
  • Just an update: this query terminates with a timeout exception in most cases (for other labels). – fattah.safa Nov 08 '18 at 19:36
  • @fattah.safa I just tried it again and it took less than 2 seconds to complete – innovimax Apr 09 '22 at 14:34
  • As @fattah.safa said, it times out in _many cases_ for **other** labels. I tried replacing "city" with other lowercase strings; some work and some don't. For example, "washington" and "oslo" work, but "curitiba" and "ourense" [end in a timeout](https://w.wiki/5DJi), although these places have English [Wikidata](https://www.wikidata.org/wiki/Q99151) and [Wikipedia](https://en.wikipedia.org/wiki/Ourense) entries. Any clues? – abu May 28 '22 at 12:12
3

As of today (June 2020), the best way to do this seems to be the CirrusSearch extensions. The following does a substring search in all English labels and comes back with 10,000 results in under 20 seconds. I believe it also searches in aliases and descriptions.

SELECT DISTINCT ?item ?label
WHERE
{
  SERVICE wikibase:mwapi
  {
    bd:serviceParam wikibase:endpoint "www.wikidata.org";
                    wikibase:api "Generator";
                    mwapi:generator "search";
                    mwapi:gsrsearch "inlabel:city"@en;
                    mwapi:gsrlimit "max".
    ?item wikibase:apiOutputItem mwapi:title.
  }
  ?item rdfs:label ?label. FILTER( LANG(?label)="en" )

  # … at this point, you have matching ?item(s) 
  # and can further restrict or use them
  # as in any other SPARQL query

  # Example: the following restricts the matches
  # to college towns (Q1187811) only

  ?item wdt:P31 wd:Q1187811 .
}

Link to this query
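
For use from a script, the same query can be assembled with an optional class restriction. This is a rough Python sketch, not a definitive client: `build_label_search_query` and `query_url` are hypothetical helper names, the endpoint URL `https://query.wikidata.org/sparql` and its `query`/`format` parameters are assumptions not stated in this thread, and the keyword is limited to plain alphanumeric words so nothing needs escaping.

```python
import re
from urllib.parse import urlencode

# Assumed public endpoint; the thread only mentions "the Wikidata SPARQL endpoint".
ENDPOINT = "https://query.wikidata.org/sparql"


def build_label_search_query(keyword, lang="en", instance_of=None):
    """Build the mwapi search query, optionally restricted to one class (P31)."""
    # Keep the sketch safe: accept only plain keywords and language codes.
    if not keyword.isalnum() or not lang.isalpha():
        raise ValueError("keyword must be alphanumeric and lang alphabetic")
    if instance_of is not None and not re.fullmatch(r"Q\d+", instance_of):
        raise ValueError("instance_of must look like 'Q1187811'")

    lines = [
        "SELECT DISTINCT ?item ?label WHERE {",
        "  SERVICE wikibase:mwapi {",
        '    bd:serviceParam wikibase:endpoint "www.wikidata.org";',
        '                    wikibase:api "Generator";',
        '                    mwapi:generator "search";',
        f'                    mwapi:gsrsearch "inlabel:{keyword}"@{lang};',
        '                    mwapi:gsrlimit "max".',
        "    ?item wikibase:apiOutputItem mwapi:title.",
        "  }",
        f'  ?item rdfs:label ?label. FILTER(LANG(?label) = "{lang}")',
    ]
    if instance_of is not None:
        # Same restriction as the college-town example above.
        lines.append(f"  ?item wdt:P31 wd:{instance_of} .")
    lines.append("}")
    return "\n".join(lines)


def query_url(query):
    """Return a GET URL for the query (assumes the endpoint accepts these parameters)."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    print(build_label_search_query("city", instance_of="Q1187811"))
```

Calling `build_label_search_query("city")` without `instance_of` leaves out the `wdt:P31` line, so the search runs unrestricted; any further triple patterns on `?item` can be appended in the same way.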

Matthias Winkelmann
  • Seems much faster, but I don't understand how to extend the query, for instance if the label must contain city and the entity be an instance of a place, not a movie etc. etc. – G M Aug 03 '21 at 16:40
  • @GM I’ve edited the example to show how to further restrict the entities. – Matthias Winkelmann Aug 07 '21 at 09:53
0

As stated above, case-insensitive and truncated (substring) queries are very slow in the SPARQL query service. I found a project on GitHub (https://github.com/inventaire/entities-search-engine) that sets up an Elasticsearch index, which allows fast queries for use cases like autocompletion.

CK_One