
Often when discussing the power of semantic databases and ontologies, I hear people say that RDF data is versatile because ontologies can be applied to the data to look at it in different ways.

However, in my experience, a dataset is usually tied to particular ontologies by its predicates: in each subject-predicate-object triple, the predicate denotes a property or relationship defined by some ontology. So, for example, if none of the data refers to some person's new "movie ontology", then I can't just use its terms in a query against DBpedia or LinkedMDB, right?

Then I'll occasionally see "linkages" in a dataset that essentially join one particular resource to a similar resource in another dataset that has its own ontology, e.g. LinkedMDB has owl:sameAs links, but this doesn't seem to be what people mean by applying ontologies to data.

How does it work and how can I use a different ontology about some subject matter in a SPARQL query on a dataset?

Joshua Taylor
Kristian

1 Answer


Then I'll occasionally see "linkages" in a dataset that essentially join one particular resource to a similar resource in another dataset that has its own ontology, e.g. LinkedMDB has owl:sameAs links, but this doesn't seem to be what people mean by applying ontologies to data.

It might not necessarily be owl:sameAs, but I think this is probably the sort of thing that you're looking for. Using RDFS or OWL, you can make a number of different kinds of assertions about properties and classes in such a way that with a little bit of reasoning, you've got a new “view” on your data. For instance, say one ontology defines some classes and properties:

o1:Film a rdfs:Class .
o1:Actor a rdfs:Class .
o1:hasActor a rdf:Property ;
            rdfs:domain o1:Film ;
            rdfs:range o1:Actor .

Another ontology defines some others:

o2:Movie a rdfs:Class .
o2:Person a rdfs:Class .
o2:Character a rdfs:Class .
o2:hasRole a rdf:Property ;
           rdfs:domain o2:Movie ;
           rdfs:range o2:Character .
o2:playsRole a rdf:Property ;
             rdfs:domain o2:Person ;
             rdfs:range o2:Character .

Now, if you have data expressed according to one ontology, you might use some axioms like these to get some information in terms of the other:

o2:Movie rdfs:subClassOf o1:Film .
o1:Film rdfs:subClassOf o2:Movie .

o1:Actor rdfs:subClassOf o2:Person .

That's just a little bit of information, but with an RDFS reasoner, you suddenly know a lot more about your instances: every o1:Film is also an o2:Movie (and vice versa), and every o1:Actor is also an o2:Person. If you're using a more expressive ontology language than RDFS, say OWL, then you can use some more expressive axioms, e.g.,

Movie ≡ Film
Actor ⊑ Person
hasRole o (inverse playsRole) ⊑ hasActor

With that last axiom in particular, you find out that anyone who plays a role that's in a movie is an actor in the movie. OWL will let you do lots more, too, but this is the general idea of ontology or schema mapping. To use this sort of approach, you'd want to define your mapping axioms and apply a reasoner to the union of them and the original dataset.
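To make "apply a reasoner to the union" a bit more concrete, here's a toy sketch in plain Python. It is not a real RDFS reasoner: it only implements the two subclass rules (rdfs9 and rdfs11), triples are just tuples of strings, and the ex: instances are made up for illustration. In practice you'd hand this job to an actual reasoner or a triple store with inference enabled.

```python
# Toy forward-chaining over RDFS subclass axioms; triples are plain tuples.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def rdfs_closure(triples):
    """Return the input triples plus everything entailed by rdfs9 and rdfs11."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for (s, p, o) in inferred if p == SUBCLASS]
        new = set()
        for sub, sup in subclass_pairs:
            for (s, p, o) in inferred:
                # rdfs9: an instance of the subclass is an instance of the superclass
                if p == TYPE and o == sub:
                    new.add((s, TYPE, sup))
                # rdfs11: subClassOf is transitive
                if p == SUBCLASS and s == sup:
                    new.add((sub, SUBCLASS, o))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred

# Hypothetical instance data, written only in o1 terms.
data = {
    ("ex:Casablanca", TYPE, "o1:Film"),
    ("ex:Bogart", TYPE, "o1:Actor"),
}
# The mapping axioms between the two vocabularies.
mapping = {
    ("o2:Movie", SUBCLASS, "o1:Film"),
    ("o1:Film", SUBCLASS, "o2:Movie"),
    ("o1:Actor", SUBCLASS, "o2:Person"),
}

# Reason over the union of the data and the mapping axioms,
# then ask questions in o2 terms.
closure = rdfs_closure(data | mapping)
movies = sorted(s for (s, p, o) in closure if p == TYPE and o == "o2:Movie")
people = sorted(s for (s, p, o) in closure if p == TYPE and o == "o2:Person")
print(movies, people)
```

The data never mentions o2:Movie or o2:Person, but after closing over the union you can ask for instances of either.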

You can do some more with rule-based reasoning too. For instance, rather than declaring the third OWL axiom above, you could write a rule:

hasRole(?movie,?role) ∧ playsRole(?actor,?role) → hasActor(?movie,?actor)

While applying rules is just another kind of reasoning, it's got a closer connection to SPARQL, because you can use SPARQL construct queries over data expressed in one ontology to produce data in terms of another. For instance, you could do:

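# ":" is a placeholder namespace; bind it to the vocabulary your data actually uses
prefix : <http://example.org/>

construct {
  ?movie :hasActor ?actor .
}
where {
  ?movie :hasRole ?role .
  ?actor :playsRole ?role .
}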
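If it helps to see what that query (or the rule before it) computes, here's a rough equivalent in plain Python: a join on the shared role. The o1:/o2: prefixes on the property names and the ex: data are assumptions for illustration only, and triples are again just tuples rather than anything from an RDF library.

```python
def apply_cast_rule(triples):
    """hasRole(m, r) and playsRole(a, r) -> hasActor(m, a)."""
    # Index actors by the role they play so the join is a dictionary lookup.
    actors_by_role = {}
    for s, p, o in triples:
        if p == "o2:playsRole":
            actors_by_role.setdefault(o, set()).add(s)
    derived = set()
    for s, p, o in triples:
        if p == "o2:hasRole":
            for actor in actors_by_role.get(o, ()):
                derived.add((s, "o1:hasActor", actor))
    return derived

# Hypothetical data in o2 terms only; the last triple has no matching movie.
data = {
    ("ex:Casablanca", "o2:hasRole", "ex:RickBlaine"),
    ("ex:Casablanca", "o2:hasRole", "ex:IlsaLund"),
    ("ex:Bogart", "o2:playsRole", "ex:RickBlaine"),
    ("ex:Bergman", "o2:playsRole", "ex:IlsaLund"),
    ("ex:Someone", "o2:playsRole", "ex:UnlistedRole"),
}

derived = apply_cast_rule(data)
for triple in sorted(derived):
    print(triple)
```

Like a construct query, this returns only the newly produced triples; you'd add them to the original data if you want to query both vocabularies together.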

You're right, though, that the idea of data interoperability is sometimes a bit oversold, or at least made to sound a bit easier and more glamorous than it is. In general, to use data, you're going to need to become familiar with the vocabulary it's expressed in. If you want to derive new data in another vocabulary from the original data, you're going to need to understand the relationships between those vocabularies pretty well, and you're going to need to apply some sort of translation (oftentimes this will be some sort of RDFS or OWL reasoner).

Joshua Taylor
  • With reference to the last paragraph: yes, familiarity with the actual ontology used to describe the data is useful, if not necessary. The difference, in my opinion, with any other structured representation of data, is given by the fact that such familiarity is verifiable: the use of a reasoner will provide all the knowledge embedded in the representation. This is not true of a database schema or of an XML representation though, where some of the knowledge is in the coding around the data rather than in the data. That said, ontologies can be poorly specified, of course :-) – Ignazio Jan 25 '14 at 10:04
  • @Ignazio if it's not too much trouble, could you write an answer and expand on this point some more? – Kristian Jan 27 '14 at 17:56
  • There's literature on the topic, I'll try to dig out a few relevant links. It's not really an answer to the question though, more like an informed opinion. That's why I added it as a comment rather than an answer. – Ignazio Jan 27 '14 at 21:21
  • http://www.ncbi.nlm.nih.gov/pubmed/21343142 This is an example of what I have in mind. – Ignazio Jan 27 '14 at 21:42