I have some entities in a specific language and I am trying to retrieve the possible IDs from Wikidata that match those names.

For example, given a German name such as "Ministerium für Auswärtige Angelegenheiten", I can get the top N candidate IDs that correspond to it like this:

    SELECT ?item
    WHERE
    {
        ?item rdfs:label "Ministerium für Auswärtige Angelegenheiten"@de
    }
    LIMIT 2

and this will give me 2 candidate IDs.

The problem is that if a name contains an inflected form, there is no exact match in the database and nothing is returned.

Even in the current example with the name: "Ministerium für Auswärtige Angelegenheiten", if I remove the word "für", I won't get any results returned.

Is there a way to make the search more flexible and return the closest results to the query, even if they are incorrect?

P.S. I am doing this from Python, using SPARQLWrapper.

Porjaz

1 Answer

Not with the WDQS SPARQL service, if I'm not mistaken.

For similar use cases, the full-text search engine might be workable. Take a look at a search query in the API Sandbox, which returns some relevant results.
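As a rough illustration of that suggestion, here is a minimal Python sketch that sends the same kind of full-text search request over plain HTTP. The parameter names are taken from the API link quoted in the comments (only a subset is used here); the helper names, the default limit, and the User-Agent string are assumptions made up for this example.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"


def build_search_url(name, language="de", limit=5):
    """Build a full-text search URL for the Wikidata MediaWiki API."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": name,
        "srnamespace": 0,   # main namespace, where the items (Q-IDs) live
        "srlimit": limit,
        "uselang": language,
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_ids(response):
    """Return the page titles (item IDs) from a parsed search response."""
    hits = response.get("query", {}).get("search", [])
    return [hit["title"] for hit in hits]


def search_candidates(name, language="de", limit=5):
    """Send the request and return candidate item IDs in ranked order."""
    request = urllib.request.Request(
        build_search_url(name, language, limit),
        # Hypothetical User-Agent; replace it with one identifying your tool.
        headers={"User-Agent": "label-lookup-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=10) as reply:
        return extract_ids(json.load(reply))
```

Calling `search_candidates("Ministerium Auswärtige Angelegenheiten")` should return a list of item IDs in the order the search engine ranks them. Since this is a ranked text search rather than a label lookup, the top hits are candidates to be checked, not guaranteed matches.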

Mormegil
  • It won't work without "für", i.e. "Ministerium Auswärtige Angelegenheiten" – UninformedUser Feb 19 '21 at 13:29
  • Why do you think so? I originally used a wrong URL in the answer, but I had tested it even without the “für”, and the results seem fine. – Mormegil Feb 19 '21 at 13:50
  • I think this could be what I am looking for. I assume that I can use this API through some Python wrapper. – Porjaz Feb 19 '21 at 13:55
  • OK, now it works. @Porjaz, it's a simple HTTP request. Indeed, you could also call the service from within SPARQL via a SERVICE clause. – UninformedUser Feb 19 '21 at 14:08
  • @Porjaz: Yes, it’s a simple HTTP GET request, but please note that you need to use [the real API link](https://www.wikidata.org/w/api.php?action=query&format=json&uselang=de&list=search&srsearch=Ministerium%20Ausw%C3%A4rtige%20Angelegenheiten&srnamespace=0&srlimit=max&srqiprofile=wikibase_config_entity_weight-de&srenablerewrites=1) (shown in the box); the link I included led to the interactive sandbox. – Mormegil Feb 19 '21 at 14:09
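
The SERVICE-clause route mentioned in the comments could look roughly like the sketch below, which stays with SPARQLWrapper as in the question. It assumes the query service's MWAPI extension exposes the `Search` API with an `srsearch` input and a `title` output, and that the `wikibase:`, `bd:` and `mwapi:` prefixes are predefined on the endpoint; check both against the MWAPI documentation before relying on this. The function names and the agent string are made up for the example.

```python
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_search_query(name, limit=5):
    """Build a query that delegates the lookup to the MediaWiki search API."""
    return f"""
SELECT ?item WHERE {{
  SERVICE wikibase:mwapi {{
    bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                    wikibase:api "Search" ;
                    mwapi:srsearch "{escape_literal(name)}" .
    ?item wikibase:apiOutputItem mwapi:title .
  }}
}}
LIMIT {int(limit)}
"""


def extract_items(results):
    """Reduce SPARQL JSON results to bare item IDs (last URI segment)."""
    bindings = results["results"]["bindings"]
    return [row["item"]["value"].rsplit("/", 1)[-1] for row in bindings]


def search_candidates_sparql(name, limit=5):
    """Run the query with SPARQLWrapper (third-party package)."""
    # Imported here so the pure helpers above work without the package.
    from SPARQLWrapper import JSON, SPARQLWrapper

    # Hypothetical agent string; replace it with one identifying your tool.
    sparql = SPARQLWrapper(WDQS_ENDPOINT, agent="label-lookup-example/0.1")
    sparql.setQuery(build_search_query(name, limit))
    sparql.setReturnFormat(JSON)
    return extract_items(sparql.query().convert())
```

Keeping the lookup inside SPARQL is mainly useful when the candidates need to be joined with further triple patterns in the same query, for example to fetch labels or filter by type; for a bare list of IDs, the direct HTTP request is simpler.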