I am using the following query against Wikidata:

SELECT ?country ?countryLabel
      WHERE
      {
        ?country   wdt:P30 wd:Q46;
                   wdt:P31 wd:Q6256.
        SERVICE wikibase:label { bd:serviceParam wikibase:language
        "[AUTO_LANGUAGE],en". }
      }

where P30 is "continent", Q46 is Europe, P31 is "instance of", and Q6256 is "country":

https://query.wikidata.org/#SELECT%20%3Fcountry%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46%3B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20wdt%3AP31%20wd%3AQ6256.%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%7D%0A%20%20%20%20%20%20%7D

Yet this query returns only 15 European countries. For instance, Sweden is not returned, even though it appears to match the query: https://www.wikidata.org/wiki/Q34

So even though the query seems to be correct, it misses many countries. Any ideas on how to resolve this?

Comparing the entries for Germany and Sweden (which do not show up) with the one for Norway (which does), the difference I could find is that Germany and Sweden have a preferred rank on the "sovereign state" statement and only a normal rank on the "country" statement. This could be the reason: the WHERE clause may match only the preferred-rank statement when one exists and skip the remaining statements. If this is the case, and I suspect it is, is there a way to override this behavior so that the query engine searches all statements with either a preferred or a normal rank?
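One way to check the rank hypothesis is to list every "instance of" statement on a single item together with its rank. The following is a minimal sketch for Sweden (wd:Q34), assuming the standard prefixes predefined by the Wikidata Query Service; the variable names `?statement`, `?class` and `?rank` are chosen here for illustration.

```sparql
SELECT ?class ?classLabel ?rank
WHERE
{
  # p: reaches the statement nodes themselves, regardless of rank
  wd:Q34 p:P31 ?statement .
  # ps: gives the value of each statement; wikibase:rank gives its rank
  ?statement ps:P31 ?class ;
             wikibase:rank ?rank .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
```

The `?rank` column holds one of `wikibase:PreferredRank`, `wikibase:NormalRank` or `wikibase:DeprecatedRank`, so it should show directly whether the "country" statement is outranked by another one.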

logi-kal
nishant
  • `wdt:P31/wdt:P279*` would return many more results, though clearly not all of them are what you would expect to get – UninformedUser Mar 20 '19 at 04:24
  • Not sure I understand what references has to do with it. For instance Portugal has 0 references for instance of country; yet it shows up. https://www.wikidata.org/wiki/Q45. I feel it might be due to the preferred rank masking out the normal ranks. – nishant Mar 20 '19 at 05:21
  • https://stackoverflow.com/a/47100906/7879193 – Stanislav Kralin Mar 20 '19 at 06:19

1 Answer

I get a better selection of countries by going around the truthy (`wdt:`) triples and querying the statements directly. The statement path returns all statements, including those with normal rank.

SELECT DISTINCT ?country ?countryLabel
      WHERE
      {
        ?country   wdt:P30 wd:Q46.
        ?country p:P31 ?country_instance_of_statement .
        ?country_instance_of_statement ps:P31 wd:Q6256 .
        SERVICE wikibase:label { bd:serviceParam wikibase:language
        "[AUTO_LANGUAGE],en". 
        }
        filter not exists{?country p:P31/ps:P31 wd:Q3024240 }
      } 
      order by ?countryLabel

I still have a few extra countries showing up; such as German Empire. But I think that's a different problem to fix.

https://query.wikidata.org/#SELECT%20distinct%20%3Fcountry%20%3Fcountry_instance_of_statement%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46.%0A%20%20%20%20%20%20%20%20%3Fcountry%20p%3AP31%20%3Fcountry_instance_of_statement%20.%0A%20%20%20%20%20%20%20%20%3Fcountry_instance_of_statement%20ps%3AP31%20wd%3AQ6256%20.%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%20%20filter%20not%20exists%7B%3Fcountry%20p%3AP31%2Fps%3AP31%20wd%3AQ3024240%20%7D%0A%20%20%20%20%20%20%7D%20%0A

Note that ?country_instance_of_statement captures all the statements irrespective of rank. Once I have those, I use 'ps:P31 wd:Q6256' to pull out the ones that have country ("wd:Q6256") as the object.
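Because the `p:` path matches statements of every rank, it also matches deprecated ones. A minimal sketch, again assuming the standard Wikidata Query Service prefixes, of how the same pattern could be limited to preferred and normal ranks (the shorter variable name `?statement` is used here only for illustration):

```sparql
SELECT DISTINCT ?country ?countryLabel
WHERE
{
  ?country wdt:P30 wd:Q46 .
  ?country p:P31 ?statement .
  ?statement ps:P31 wd:Q6256 .
  # p: also matches deprecated statements; drop them so that only
  # preferred-rank and normal-rank statements remain
  FILTER NOT EXISTS { ?statement wikibase:rank wikibase:DeprecatedRank }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
ORDER BY ?countryLabel
```

Whether this changes the result depends on whether any European item currently carries a deprecated "country" statement.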

I have added the suggestions from @AKSW in the comments.

And for those who want another approach, using the end time qualifier on the country statement, this is the SPARQL:

SELECT distinct ?country ?countryLabel
      WHERE
      {
        ?country   wdt:P30 wd:Q46.
        ?country p:P31 ?country_instance_of_statement .
        ?country_instance_of_statement ps:P31 wd:Q6256 .
        filter not exists {?country_instance_of_statement pq:P582 ?endTime }
        SERVICE wikibase:label { bd:serviceParam wikibase:language
        "[AUTO_LANGUAGE],en". 
        }
      } 
      order by ?countryLabel

https://query.wikidata.org/#SELECT%20distinct%20%3Fcountry%20%3Fcountry_instance_of_statement%20%3FcountryLabel%0A%20%20%20%20%20%20WHERE%0A%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%3Fcountry%20%20%20wdt%3AP30%20wd%3AQ46.%0A%20%20%20%20%20%20%20%20%3Fcountry%20p%3AP31%20%3Fcountry_instance_of_statement%20.%0A%20%20%20%20%20%20%20%20%3Fcountry_instance_of_statement%20ps%3AP31%20wd%3AQ6256%20.%0A%20%20%20%20%20%20%20%20filter%20not%20exists%20%7B%3Fcountry_instance_of_statement%20pq%3AP582%20%3FendTime%20%7D%0A%20%20%20%20%20%20%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%0A%20%20%20%20%20%20%20%20%22%5BAUTO_LANGUAGE%5D%2Cen%22.%20%0A%20%20%20%20%20%20%20%20%7D%0A%20%20%20%20%20%20%7D%20%0A

nishant
  • Yeah, sometimes the data is really weird. You could use a filter to ignore historical countries: `filter not exists{?country p:P31/ps:P31 wd:Q3024240 }`. You should also use `SELECT distinct ?country ?countryLabel`, because there are two statements for Switzerland. In the end we get 51 countries, which is correct. – UninformedUser Mar 20 '19 at 06:59
  • @AKSW - Thanks, I was thinking of testing to see if there is no "end time" in country, but this would do as well. I have made the changes but I still have 52 countries with Kingdom of the Netherlands showing up in addition to Netherlands. – nishant Mar 20 '19 at 07:40
  • Your query also returns Switzerland twice because of the two bindings for the `?country_instance_of_statement` variable. Once you remove it and use `DISTINCT`, it works. – UninformedUser Mar 20 '19 at 09:38
  • Regarding Netherlands, yeah - weird. But *Netherlands* is just the country in Europe while *Kingdom of the Netherlands* does also include its colonies in the Caribbean Sea – UninformedUser Mar 20 '19 at 09:40
  • Well, I double checked the result. What I can say, Austria is missing here. Not sure why but it's not an instance of country. Maybe state or sovereign state is more appropriate? – UninformedUser Mar 20 '19 at 09:58
  • `SELECT distinct ?country ?countryLabel WHERE { ?country wdt:P30 wd:Q46. ?country p:P31 ?country_instance_of_statement . ?country_instance_of_statement ps:P31 wd:Q3624078 filter not exists{?country p:P31/ps:P31 wd:Q3024240 } SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } } order by ?countryLabel` – UninformedUser Mar 20 '19 at 10:00
  • I am going to leave it be by country (not sovereign state) else it might cause issues in other continents. Austria should be fixed in wikidata to have a country property. @AKSW - per your suggestion, I have removed the additional binding for ?country_instance_of_statement and put in an ORDER BY. – nishant Mar 20 '19 at 16:25