I'm trying to get some company data from Wikidata, but for some reason the query below doesn't work for companies such as "Groupe La Poste" or "Google":

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT
?company ?alternative ?companyLabel ?isin ?web ?country ?countryLabel ?inception ?employees ?employeesDate ?hq ?hqLabel ?countryHq ?countryHqLabel

WHERE
{
     ?article schema:inLanguage "en" .
     ?article schema:isPartOf <https://en.wikipedia.org/>.
     ?article schema:about ?company .

     ?company p:P31/ps:P31/wdt:P279 wd:Q4830453.
     ?company wdt:P946 ?isin.
     ?company rdfs:label "Groupe La Poste"@en.
     OPTIONAL {?company wdt:P856 ?web.}
     OPTIONAL {?company wdt:P571 ?inception.}
     OPTIONAL {?company wdt:P17 ?country.}
     OPTIONAL {?company p:P1128 ?employeesStatement.}
     OPTIONAL {?employeesStatement ps:P1128 ?employees.}
     OPTIONAL {?employeesStatement pq:P585 ?employeesDate.}
     FILTER NOT EXISTS {
        ?employeesStatement pq:585 ?employeesDate1
        FILTER (?employeesDate1 > ?employeesDate)
     }
     OPTIONAL {?company wdt:P159 ?hq.}
     OPTIONAL {?hq wdt:P17 ?countryHq.}
     OPTIONAL {?company skos:altLabel ?alternative.
              FILTER (LANG (?alternative) = "en")}
     SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }    
} 

Also, for companies such as Ubisoft, I don't get only the latest employees date; I get two of them. I'm fairly new to all this. What am I doing wrong?

UPDATE 1: I added the `*` after the subclass path and put everything related to `?employeesDate` in a single `OPTIONAL`.

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT
?company ?alternative ?companyLabel ?isin ?web ?country ?countryLabel ?inception ?employees ?employeesDate ?hq ?hqLabel ?countryHq ?countryHqLabel

WHERE
{
     ?article schema:inLanguage "en" .
     ?article schema:isPartOf <https://en.wikipedia.org/>.
     ?article schema:about ?company .

     ?company p:P31/ps:P31/wdt:P279* wd:Q4830453.
     ?company rdfs:label "Groupe La Poste"@en.

     OPTIONAL {?company wdt:P946 ?isin.}
     OPTIONAL {?company wdt:P856 ?web.}
     OPTIONAL {?company wdt:P571 ?inception.}
     OPTIONAL {?company wdt:P17 ?country.}
     OPTIONAL {?company p:P1128 ?employeesStatement. 
               ?employeesStatement ps:P1128 ?employees. 
               ?employeesStatement pq:P585 ?employeesDate. } 
     FILTER NOT EXISTS { ?company pq:P585 ?otherEmployeesDate FILTER (?otherEmployeesDate > ?employeesDate) }
     OPTIONAL {?company wdt:P159 ?hq.}
     OPTIONAL {?hq wdt:P17 ?countryHq.}
     OPTIONAL {?company skos:altLabel ?alternative.
              FILTER (LANG (?alternative) = "en")}
     SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }    
}
  • Did you check whether all the data is there? For example, go to the page of "Groupe La Poste": https://www.wikidata.org/wiki/Q373724. 1) You can see that there is no `wdt:P946` property, and 2) it is a direct instance of `wd:Q4830453`, but your path `p:P31/ps:P31/wdt:P279` requires it to be an instance of a subclass of `wd:Q4830453`. The path should be `p:P31/ps:P31/wdt:P279*`, and `?company wdt:P946 ?isin.` should be put inside an `OPTIONAL` – UninformedUser Apr 28 '20 at 07:17
  • Regarding the other issue, try `OPTIONAL {?company p:P1128 ?employeesStatement. ?employeesStatement ps:P1128 ?employees. ?employeesStatement pq:P585 ?employeesDate. } FILTER NOT EXISTS { ?company p:P1128/pq:P585 ?otherEmployeesDate FILTER (?otherEmployeesDate > ?employeesDate) }` instead of putting each triple pattern in a separate `OPTIONAL`, because the scope matters – UninformedUser Apr 28 '20 at 07:57
  • Thank you for clarifying the first part. It now works for the companies that were not working before. Regarding the employees date, I still get multiple dates. Is there a way to get only the latest one? I was trying to reproduce the result from https://stackoverflow.com/questions/36181713/sparql-query-to-get-only-results-with-the-most-recent-date, but it didn't seem to work – Arhiliuc Cristina Apr 28 '20 at 09:12
  • That's what I showed in the second part of my comment. You have to use a single `OPTIONAL` for all the employee statement triple patterns and use a filter to check for another statement: `OPTIONAL {?company p:P1128 ?employeesStatement. ?employeesStatement ps:P1128 ?employees. ?employeesStatement pq:P585 ?employeesDate. } FILTER NOT EXISTS { ?company p:P1128/pq:P585 ?otherEmployeesDate FILTER (?otherEmployeesDate > ?employeesDate) }` – UninformedUser Apr 28 '20 at 09:16
  • I put it in only one optional, but it still gives me multiple dates. I copied what you gave me, and I modified "p:P1128/pq:P585" to "pq:585" because with both it didn't return any result for Ubisoft. I will update my code. Btw, I still get no result for "Groupe La Poste" – Arhiliuc Cristina Apr 28 '20 at 09:23
  • I didn't get the results for La Poste because there was no data regarding the employees, so I fixed it. – Arhiliuc Cristina Apr 28 '20 at 09:40
  • I don't know what you were doing. I put it inside an `OPTIONAL`, and so it worked for `La Poste`. Also, I'm still not sure what you mean by "not working"; the query works for Ubisoft as expected – UninformedUser Apr 28 '20 at 10:17
  • And `?company pq:P585 ?otherEmployeesDate` is wrong. Why did you change it instead of using my code? `pq:P585` is a statement qualifier, so it is only attached to statements. That's why I used `?company p:P1128/pq:P585 ?otherEmployeesDate`, which is nothing more than a shortcut for `?company p:P1128 ?otherEmployeeStatement . ?otherEmployeeStatement pq:P585 ?otherEmployeesDate` – UninformedUser Apr 28 '20 at 10:19
  • Never mind, it works. I've probably done something wrong before. Could you post it as an answer so I can accept it? – Arhiliuc Cristina Apr 28 '20 at 10:21
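
Putting the suggestions from the comments together, the query could look roughly like the sketch below. It is trimmed to the ISIN and employee fields, uses "Ubisoft" as the example label, and assumes it is run on the Wikidata Query Service, where the `wd:`, `wdt:`, `p:`, `ps:`, `pq:`, `wikibase:` and `bd:` prefixes are predefined. It has not been re-checked against the current Wikidata data, so treat it as a starting point rather than a verified answer.

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?company ?companyLabel ?isin ?employees ?employeesDate
WHERE
{
     # Matches direct instances of wd:Q4830453 as well as instances of any
     # of its subclasses (the * allows zero or more P279 steps).
     ?company p:P31/ps:P31/wdt:P279* wd:Q4830453.
     ?company rdfs:label "Ubisoft"@en.

     # Some companies have no ISIN, so this pattern must not be required.
     OPTIONAL {?company wdt:P946 ?isin.}

     # One OPTIONAL for the whole employee statement, so that the count
     # and its date come from the same statement.
     OPTIONAL {?company p:P1128 ?employeesStatement.
               ?employeesStatement ps:P1128 ?employees.
               ?employeesStatement pq:P585 ?employeesDate.}

     # Drop a row if the company has another employee statement with a
     # later date. The qualifier pq:P585 hangs off the statement node,
     # so it is reached through p:P1128, not directly from ?company.
     FILTER NOT EXISTS { ?company p:P1128/pq:P585 ?otherEmployeesDate
                         FILTER (?otherEmployeesDate > ?employeesDate) }

     SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
```

The other `OPTIONAL` patterns from the question (website, inception, country, headquarters, alternative labels) can be added back unchanged; only the subclass path, the ISIN pattern and the employee-date filter differ from the original query.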
