
So I have an ontology I've built in Protege which has annotations and sub-annotations. What I mean by that is that a concept might have a definition and that definition might have a comment.

So you might have something like (s,p,o):

'http://purl.fakeiri.org/ONTO/1111' --> 'label' --> 'Term'

'Term' --> 'comment' --> 'Comment about term.'

I am trying to make the ontology easily explorable using a Flask app (I'm using Python to parse the ontology file), and I can't seem to quickly get all of the annotations and sub-annotations.

I started using the owlready2 package but it requires you to self-define each individual annotation property (you can't just get a list of all of them, so if you add a property like random_identifier you have to go back into the code and add entity.random_identifier or it won't be picked up). This works okay, it's pretty fast, but subannotations require loading the IRI, then searching for it as:

random_prop = IRIS['http://schema.org/fillerName']
sub_annotation = x[entity, random_prop, annotation_label]

This is extremely slow: searching through around 140 sub-annotation types takes 5-10 minutes, compared to about 3-5 seconds for the annotations alone.

From there I decided to scrap owlready2 and try rdflib. However, it looks like sub-annotations are just attached as BNodes and I can't figure out how to access them through their "parent" annotation or if that's even possible.

TL;DR: Does anybody know how to access an entry and gather all of its annotations and sub-annotations quickly in an XML/RDF ontology file?

EDIT 1:

As suggested, here is a snippet of the ontology:

    <!-- http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C42610 -->

    <owl:Class rdf:about="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C42610">
        <rdfs:subClassOf rdf:resource="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C42698"/>
        <obo:IAO_0000115 xml:lang="en">A shortened form of a word or phrase.</obo:IAO_0000115>
        <oboInOwl:hasDbXref rdf:datatype="http://www.w3.org/2001/XMLSchema#anyURI">https://en.wikipedia.org/wiki/Abbreviation</oboInOwl:hasDbXref>
        <rdfs:label xml:lang="en">abbreviation</rdfs:label>
        <schema:alternateName xml:lang="en">abbreviations</schema:alternateName>
        <Property:P1036 rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">411</Property:P1036>
    </owl:Class>
    <owl:Axiom>
        <owl:annotatedSource rdf:resource="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C42610"/>
        <owl:annotatedProperty rdf:resource="https://www.wikidata.org/wiki/Property:P1036"/>
        <owl:annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">411</owl:annotatedTarget>
        <schema:bookEdition rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">20</schema:bookEdition>
    </owl:Axiom>

Thank you all so much!

user3684314
  • Can you add a snippet from the ontology generated by Protege showing an annotation and a sub-annotation? OWL defines annotations on IRIs or anonymous individuals, annotations on axioms, and nested annotations (i.e., annotations on annotations), but nothing specific to annotations on annotation values. (If the annotation value is an IRI or an anonymous individual, it can be annotated, but that's simply a separate annotation axiom.) Depending on what you're after, different APIs might have very different ways of accessing the data, and the SPARQL query would also differ. – Ignazio Nov 27 '19 at 20:08
  • @Ignazio I totally didn't even think of that, thanks so much! I added above. It looks like it annotates the source, the property, and the target directly after the class as an axiom. – user3684314 Dec 01 '19 at 21:28

3 Answers

1

From your question I gather that the 'sub-annotation' level is only ever one deep. If that is the case, you could do a SPARQL query as follows:

PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT ?annProp ?annValue ?subAnn ?subValue
WHERE {
   ?annProp a owl:AnnotationProperty .
   <the:concept> ?annProp ?annValue .
   OPTIONAL { ?annValue ?subAnn ?subValue . }
}

This will retrieve all annotation properties and their values for the given concept the:concept, and optionally, if that annotation has a "sub-annotation", it also retrieves that sub-annotation.

Jeen Broekstra
  • In OWL2 an entity can have two kinds of annotations: annotation assertions and bulk annotations (a b-node with `owl:Axiom` as its rdf:type); see https://www.w3.org/TR/owl2-mapping-to-rdf/#Axioms_that_Generate_a_Main_Triple. This answer handles only the first possibility. It also does not take into account sub-annotations of sub-annotations. In Java there are APIs for working with annotations. Are there analogues for Python? – ssz Nov 27 '19 at 11:23
  • @ssz I assumed from the example in the question that the OP was dealing with annotation assertions only and, as I said in the answer, this indeed assumes there is only one level of sub-annotations, again because I had the impression that was the OP's case. But I'm doing some guessing here because there's not enough detail in the question. As for APIs in Python: I'm not sure. I personally haven't done much Semantic Web work in Python. – Jeen Broekstra Nov 28 '19 at 05:39
1

So I was overlooking the obvious... I updated owlready2 from 0.18 to 0.22 and it's lightning fast now.

user3684314
0

"XPath expressions," which are a way of specifying a search into an XML structure, might be able to get the job done.

See:

How to use Xpath in Python?

https://docs.python.org/2/library/xml.etree.elementtree.html#xpath-support

If you have the data in an XML structure, XPath can probably walk through the tree (for you ...) and retrieve the nodes of interest.
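
As a rough illustration of that idea, here is a sketch using the standard library's xml.etree.ElementTree and its limited XPath support. It assumes the file is serialized as RDF/XML exactly like the snippet in the question, with sub-annotations as extra children of owl:Axiom elements. The function name axiom_annotations and the trimmed sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]
# Children of owl:Axiom that describe the annotated triple, not a sub-annotation.
STRUCTURAL = {"{%s}%s" % (NS["owl"], name)
              for name in ("annotatedSource", "annotatedProperty", "annotatedTarget")}

# Trimmed sample modelled on the question's snippet; xmlns values are assumed.
DATA = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:schema="http://schema.org/">
    <owl:Axiom>
        <owl:annotatedSource rdf:resource="http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C42610"/>
        <owl:annotatedProperty rdf:resource="https://www.wikidata.org/wiki/Property:P1036"/>
        <owl:annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">411</owl:annotatedTarget>
        <schema:bookEdition rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">20</schema:bookEdition>
    </owl:Axiom>
</rdf:RDF>
"""


def axiom_annotations(root, concept_iri):
    """Collect (property IRI, target text, sub-annotation tag, sub-annotation text)."""
    rows = []
    # XPath: every owl:Axiom element that has an owl:annotatedSource child.
    for axiom in root.findall(".//owl:Axiom[owl:annotatedSource]", NS):
        source = axiom.find("owl:annotatedSource", NS)
        if source.get(RDF_RESOURCE) != concept_iri:
            continue
        prop = axiom.find("owl:annotatedProperty", NS)
        target = axiom.find("owl:annotatedTarget", NS)
        for child in axiom:
            if child.tag not in STRUCTURAL:
                rows.append((
                    prop.get(RDF_RESOURCE) if prop is not None else None,
                    target.text if target is not None else None,
                    child.tag,   # Clark notation: {namespace}localname
                    child.text,
                ))
    return rows


root = ET.fromstring(DATA)
rows = axiom_annotations(root, "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#C42610")
print(rows)
```

This reads the XML layout directly, so it would miss the same data if the ontology were saved in another serialization or if annotation targets were written as rdf:resource attributes instead of element text.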

Mike Robinson
  • This is likely to be brittle, as an OWL ontology can be serialized in many different ways, and Protege can choose to re-order things whenever you edit the ontology. Typically, XML is not the right abstraction level to use when processing OWL ontologies. – Jeen Broekstra Nov 26 '19 at 20:58