
In Wikidata, I want to find an item's country: either directly, if the item itself has a P17 (country) statement, or by climbing up the P131s (located in the administrative territorial entity) until I find a country. Here is the query:

?item wdt:P131*/wdt:P17 ?country.

The query above works fine... except when a sub-division used to belong to another country, as with Q25270 (Prishtina). In such cases, the result can be anachronistic. That's what I want to fix.

Great news: in such cases we should only consider the unique P131 (located in the administrative territorial entity) that has no P582 (end time) sub-property attached to it, and the problem is solved!

My question: how to alter my query above to achieve that?

Example: Let's say MyItem is in MyStreet is in MyTown is in MyRegion is in MyCountry, I must make sure that MyStreet, MyTown, and MyRegion do not have a P582 (end time).


(If "sub-property" is not the correct term, please let me know the right term and I will fix the question, thanks!)

An attempt

The query below works in most cases, but unfortunately it has a bug: It finds the wrong country in cases where the current country was also the country in the past (for instance Alsace belonged to France until 1871 then to Germany and currently to France again).

SELECT DISTINCT ?country WHERE {
  wd:Q6556803 wdt:P131* ?area .
  ?area wdt:P17 ?country .
  OPTIONAL {
    wd:Q6556803 wdt:P131*/p:P131 [
      pq:P582 ?endTime; ps:P131/wdt:P131* ?area
    ] .
  } .
  FILTER( !BOUND( ?endTime ) ) .
}
Nicolas Raoul
  • The clause `FILTER NOT EXISTS { ?statement pq:P582 ?x }` has 3 occurrences [here](https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/queries/examples). It's an official way. However, see [this proposal](https://www.wikidata.org/wiki/Wikidata:Property_proposal/located_in_present-day_administrative_territorial_entity). – Stanislav Kralin Jun 01 '17 at 08:43
  • @StanislavKralin: Unfortunately, none of these examples contain a wildcard (`*`), right? My question's challenge is to check the property for **all** of the P131s that climb up to the country. For instance, if MyItem is in MyStreet is in MyTown is in MyRegion is in MyCountry, I must make sure that MyStreet, MyTown, and MyRegion do not have an end time. About the proposal: Interesting and very relevant, thanks! – Nicolas Raoul Jun 01 '17 at 09:37
  • I don't think it's possible. See [this](https://stackoverflow.com/questions/38641984/sparql-applying-limiting-criteria-to-predicates) fun (and unanswered) question, and it's unlikely that Wikidata's reification makes things easier. – Stanislav Kralin Jun 01 '17 at 13:48
  • I don't know a way to do it using SPARQL. I think you need to do it in external code. – Median Hilal Jun 01 '17 at 14:07

1 Answer


Wikidata uses different properties for direct links and links with extra information. So, for the statement "Prishtina is located in the administrative territorial entity Socialist Autonomous Province of Kosovo", there's the simple triple:

wd:Q25270 wdt:P131 wd:Q646035

And the long form with additional information (the end time):

wd:Q25270 p:P131 wds:Q25270-7df79cec-4938-8b6d-4e11-4dde6f72d73b .

wds:Q25270-7df79cec-4938-8b6d-4e11-4dde6f72d73b ps:P131 wd:Q646035 ;
    pq:P582 "1990-01-01T00:00:00Z"
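
To see both forms side by side, a query along the following lines should list every P131 statement of Prishtina together with its end time, if any. This is only an illustrative sketch: the variable names (`?statement`, `?parent`, `?endTime`) are my own, and the prefixes are assumed to be the ones predefined by the Wikidata Query Service.

```sparql
# List each "located in" (P131) statement of Prishtina (Q25270)
# along with its end-time qualifier (P582), when one exists.
SELECT ?statement ?parent ?parentLabel ?endTime WHERE {
  wd:Q25270 p:P131 ?statement .              # item -> statement node
  ?statement ps:P131 ?parent .               # statement node -> value
  OPTIONAL { ?statement pq:P582 ?endTime }   # qualifier, if present
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
```

Rows where `?endTime` is unbound are the statements without an end time, i.e. the ones we want to follow.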

So, we need to filter out all paths with an end time (pq:P582):

SELECT DISTINCT ?s ?sLabel ?country ?countryLabel {
  VALUES ?s {
    wd:Q25270 
  }
  ?s wdt:P131* ?area .
  ?area wdt:P17 ?country .
  FILTER NOT EXISTS {
    ?s p:P131/(ps:P131/p:P131)* ?statement .
    ?statement ps:P131 ?area .
    ?s p:P131/(ps:P131/p:P131)* ?intermediateStatement .
    ?intermediateStatement (ps:P131/p:P131)* ?statement .
    ?intermediateStatement pq:P582 ?endTime .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
limit 50

Here, ?intermediateStatement is a statement with an end time on the path from ?s to a country.

This query does seem to time out when more than one value is given for ?s. Also, the query does not take into account that there might be multiple links from an item to an area, where one has an end time and the other doesn't (both paths will be filtered out).
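
One possible workaround for the second limitation is to give up the `*` path and unroll the climb to a fixed depth, checking each statement individually. The following is a sketch only, not tested against the live endpoint. The depth limit of 3 is an arbitrary assumption, and the variable names (`?st1`, `?a1`, `?end1`, and so on) are hypothetical. Note also that `p:P131` matches statements of every rank, including deprecated ones, whereas `wdt:P131` only follows best-rank statements.

```sparql
# Fixed-depth alternative: follow only P131 statements that have no
# end time (P582), up to 3 levels above the item (arbitrary limit).
SELECT DISTINCT ?s ?country ?countryLabel WHERE {
  VALUES ?s { wd:Q25270 }
  {
    # Depth 0: the item has a country directly.
    ?s wdt:P17 ?country .
  } UNION {
    # Depth 1: one current P131 hop, then the country.
    ?s p:P131 ?st1 . ?st1 ps:P131 ?a1 .
    FILTER NOT EXISTS { ?st1 pq:P582 ?end1 }
    ?a1 wdt:P17 ?country .
  } UNION {
    # Depth 2: two current P131 hops.
    ?s p:P131 ?st1 . ?st1 ps:P131 ?a1 .
    ?a1 p:P131 ?st2 . ?st2 ps:P131 ?a2 .
    FILTER NOT EXISTS { ?st1 pq:P582 ?end1 }
    FILTER NOT EXISTS { ?st2 pq:P582 ?end2 }
    ?a2 wdt:P17 ?country .
  } UNION {
    # Depth 3: three current P131 hops.
    ?s p:P131 ?st1 . ?st1 ps:P131 ?a1 .
    ?a1 p:P131 ?st2 . ?st2 ps:P131 ?a2 .
    ?a2 p:P131 ?st3 . ?st3 ps:P131 ?a3 .
    FILTER NOT EXISTS { ?st1 pq:P582 ?end1 }
    FILTER NOT EXISTS { ?st2 pq:P582 ?end2 }
    FILTER NOT EXISTS { ?st3 pq:P582 ?end3 }
    ?a3 wdt:P17 ?country .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" }
}
```

Because each hop is tested on its own statement node, a current link to an area is kept even when an ended link to the same area also exists. The price is that items nested deeper than the chosen depth are missed, so more UNION branches would be needed for them.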

evsheino
  • Unfortunately it times out: https://query.wikidata.org/#SELECT%20%3Fcountry%20WHERE%20%7B%0A%20wd%3AQ6556803%20wdt%3AP131%2a%20%3Farea.%0A%20%3Farea%20wdt%3AP17%20%3Fcountry.%0A%20FILTER%20NOT%20EXISTS%20%7B%0A%20%20wd%3AQ6556803%20p%3AP131%2F%28ps%3AP131%2Fp%3AP131%29%2a%20%3Fstatement.%0A%20%20%3Fstatement%20ps%3AP131%20%3Farea.%0A%20%20wd%3AQ6556803%20p%3AP131%2F%28ps%3AP131%2Fp%3AP131%29%2a%20%3FintermediateStatement.%0A%20%20%3FintermediateStatement%20%28ps%3AP131%2Fp%3AP131%29%2a%20%3Fstatement.%0A%20%20%3FintermediateStatement%20pq%3AP582%20%3FendTime.%0A%20%7D%0A%7D Did I make a mistake? – Nicolas Raoul Jun 16 '17 at 06:47
  • It does work if you bind `wd:Q6556803` to a variable. So use my query and just replace `wd:Q25270` with `wd:Q6556803`. – evsheino Jun 18 '17 at 08:45