I want to retrieve information from the Wikipedia infoboxes of soccer players, specifically the properties name, team, team number, appearances, and goals, using the URIs returned by this Wikidata query:

SELECT ?SoccerPlayer ?SoccerPlayerLabel ?Team ?TeamLabel ?TeamNumber ?numMatches ?numGoals ?startTime ?article
WHERE {
  ?SoccerPlayer wdt:P106 wd:Q937857 ;
                p:P54 ?stmt .
  ?stmt ps:P54 ?Team ;
        pq:P1350 ?numMatches ;
        pq:P1351 ?numGoals ;
        pq:P580 ?startTime .
  OPTIONAL { ?stmt pq:P1618 ?TeamNumber }
  FILTER NOT EXISTS {
    ?SoccerPlayer p:P54/pq:P580 ?startTimeOther .
    FILTER(?startTimeOther > ?startTime)
  }
  FILTER(?startTime >= "2018-01-01T00:00:00Z"^^xsd:dateTime)
  OPTIONAL {
    ?article schema:about ?SoccerPlayer ;
             schema:isPartOf <https://en.wikipedia.org/> .
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en" . }
}
LIMIT 200
amine4392
  • Wikidata != Wikipedia ... DBpedia is the extraction of Wikipedia Infoboxes – UninformedUser Feb 05 '20 at 13:22
  • I don't want to use DBpedia; it extracts data with some latency, even DBpedia-Live – amine4392 Feb 05 '20 at 13:58
  • Ok, but Wikidata has nothing to do with the Wikipedia infoboxes - that's a fact ... if you're using Wikidata, you can only query what is in Wikidata - obviously – UninformedUser Feb 05 '20 at 15:26
  • Does this answer your question? [How to extract information from a Wikipedia infobox?](https://stackoverflow.com/questions/33862336/how-to-extract-information-from-a-wikipedia-infobox) – Tgr Feb 07 '20 at 09:21

1 Answer

Wikipedia infoboxes are plain text (wikitext template markup), so they cannot be queried directly.

Instead, use either DBpedia or Wikidata. DBpedia is likely more complete than Wikidata when it comes to data stored in English Wikipedia infoboxes, but it cannot provide much information beyond that. In contrast, Wikidata aggregates data from various sources and can provide information about entities that do not have a Wikipedia article.

Pascalco
  • But neither wikidata nor dbpedia is as up-to-date as wikipedia, and it's a crucial issue in my case because I tend to compare wikipedia and wikidata data freshness, so I can't use wikidata. Is there any other way to retrieve the values (name, team, team number, appearances, goals) by using the soccer players' URIs? – amine4392 Feb 05 '20 at 18:40
  • In that case the only option I see is to parse the text by yourself. If you know the article title, you can call `https://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&rvprop=content&rvslots=*&titles=TITLE`. Then you can search for `{{Infobox football biography` and try to extract your data. – Pascalco Feb 05 '20 at 19:23
  • @MohamedAmineFerradji but what do you expect now? Do the extraction by yourself if you're not happy with Wikidata or DBpedia. You can also contribute to those projects, those are not commercial projects ... – UninformedUser Feb 06 '20 at 07:05
  • @MohamedAmineFerradji *"because I tend to compare wikipedia and wikidata data freshness"* - I hope you're not planning to submit a paper. Also not sure how you would compare "freshness". Anyways, there is nothing but Wikipedia itself if you want to do a comparison. Wikidata is **not** based on Wikipedia, and DBpedia is not sufficient in your opinion. That means: do the fact extraction by yourself. Case closed. If you think this is trivial, the DBpedia community is always happy to see contributions. Good luck ... – UninformedUser Feb 06 '20 at 07:08
  • @UninformedUser yes, I'm planning to submit a paper, considering that both wikipedia and wikidata rely on the same concept for publishing information, but wikipedia seems more up-to-date than wikidata – amine4392 Feb 06 '20 at 09:59
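
The do-it-yourself approach suggested in the comments (fetch the page wikitext through the MediaWiki API, then look for `{{Infobox football biography`) could be sketched roughly as follows. This is only a minimal sketch under several assumptions: the helper names (`title_from_article_url`, `fetch_wikitext`, `extract_infobox`) are hypothetical, the request adds `formatversion=2` and `rvslots=main` (not in the URL quoted above), and the brace-counting parser ignores edge cases such as `<nowiki>` sections, HTML comments, and triple-brace parameters.

```python
import json
import re
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def title_from_article_url(article_url):
    """Turn an ?article URL returned by the query into a page title."""
    path = urllib.parse.urlparse(article_url).path
    return urllib.parse.unquote(path.rsplit("/", 1)[-1]).replace("_", " ")


def fetch_wikitext(title):
    """Fetch the current wikitext of a page (network request; raises
    KeyError if the page does not exist)."""
    params = urllib.parse.urlencode({
        "action": "query", "format": "json", "formatversion": "2",
        "prop": "revisions", "rvprop": "content", "rvslots": "main",
        "titles": title,
    })
    request = urllib.request.Request(
        f"{API_URL}?{params}",
        headers={"User-Agent": "infobox-sketch/0.1 (example)"},  # placeholder
    )
    with urllib.request.urlopen(request) as response:
        data = json.load(response)
    return data["query"]["pages"][0]["revisions"][0]["slots"]["main"]["content"]


def extract_infobox(wikitext, name="Infobox football biography"):
    """Return the infobox parameters as a dict, or {} if it is absent."""
    match = re.search(r"\{\{\s*" + re.escape(name), wikitext, re.IGNORECASE)
    if match is None:
        return {}
    parts, current, depth, i = [], [], 0, match.start()
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            current.append(pair)
            i += 2
        elif pair in ("}}", "]]"):
            depth -= 1
            if depth == 0:  # closing braces of the infobox itself
                break
            current.append(pair)
            i += 2
        elif wikitext[i] == "|" and depth == 1:
            # A top-level pipe separates two infobox parameters.
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(wikitext[i])
            i += 1
    parts.append("".join(current))
    fields = {}
    for part in parts[1:]:  # parts[0] is the template name
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields
```

For an `?article` value from the query, something like `extract_infobox(fetch_wikitext(title_from_article_url(article)))` would then give a dict of raw parameter strings. The values are still wikitext (links, nested templates), so fields such as the club or the appearance count need further cleaning, and the exact parameter names should be checked against the template's documentation.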