I am trying to extract the hierarchy of Wikipedia category or Yago classification for DBpedia resources using the SPARQL endpoint. For instance, I would like to find out all the possible categories and classes in hierarchical form of entity, say, http://dbpedia.org/resource/Nokia, like Thing → Organization → Company → … → Nokia.
-
I tried extracting the categories using sparql but couln't implement it successfully. can u give any suggestions? – learner Jun 04 '12 at 15:26
-
The reason I asked is that it is usually more helpful if you actually show, in detail, what queries you've tried and why it doesn't quite do what you expect. We can then give you suggestions on how to fix the problems. – Jeen Broekstra Jun 04 '12 at 21:50
-
I have been able to do specific queries for some known entities like the above one but i have a rather big list entities that i want to automate to get the desired results. – learner Jun 05 '12 at 22:24
1 Answers
A simple SPARQL select can retrieve the information that you're interested in, though it won't be arranged hierarchically. You're interested in getting all the types of a resource, as well as the rdfs:subClassOf
relations between them. Here's a very simple query for Nokia that can be run on the DBpedia SPARQL endpoint
SELECT * WHERE {
dbpedia:Nokia a ?c1 ; a ?c2 .
?c1 rdfs:subClassOf ?c2 .
}
If you treat each pair of classes in that result set as a directed edge and perform a topological sort , then you'll see the hierarchy of the classes to which the Nokia resource belongs. In fact, since it is probably convenient to treat this as a graph, you can get it in the form of an RDF graph by using a SPARQL construct query.
CONSTRUCT WHERE {
dbpedia:Nokia a ?c1 ; a ?c2 .
?c1 rdfs:subClassOf ?c2 .
}
The construct query produces this graph (in N3 format):
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix yago: <http://dbpedia.org/class/yago/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
dbpedia-owl:Agent rdfs:subClassOf owl:Thing .
dbpedia-owl:Company rdfs:subClassOf dbpedia-owl:Organisation .
dbpedia-owl:Organisation rdfs:subClassOf dbpedia-owl:Agent .
yago:CompaniesBasedInEspoo rdfs:subClassOf yago:Company108058098 .
dbpedia:Nokia rdf:type yago:CompaniesListedOnTheHelsinkiStockExchange ,
owl:Thing ,
yago:CompaniesBasedInEspoo ,
dbpedia-owl:Agent ,
yago:DisplayTechnologyCompanies ,
yago:ElectronicsCompaniesOfFinland ,
dbpedia-owl:Company ,
dbpedia-owl:Organisation ,
yago:Company108058098 ,
yago:CompaniesEstablishedIn1865 .
yago:CompaniesEstablishedIn1865 rdfs:subClassOf yago:Company108058098 .
yago:CompaniesListedOnTheHelsinkiStockExchange rdfs:subClassOf yago:Company108058098 .
yago:DisplayTechnologyCompanies rdfs:subClassOf yago:Company108058098 .
yago:ElectronicsCompaniesOfFinland rdfs:subClassOf yago:Company108058098 .
Remarks
The queries above retrieve the rdf:type
hierarchy for Nokia. In the question, you also mention Wikipedia categories. DBpedia resources are associated with the Wikipedia categories to which their corresponding articles belong by the dcterms:subject
property. Those Wikipedia categories are then structured hierarchically by skos:broader
. These really are not types for the individuals though. For instance, the data contain:
dbpedia:Nokia dcterms:subject category:Finnish_brands
category:Finnish_brands skos:broader category:Brands_by_country
While it probably makes sense to say that Nokia is a Finnish_brand, it makes much less sense to say that Nokia is a Brand_by_country.

- 84,998
- 9
- 154
- 353