Is there a SPARQL query or other method to extract a single class and all of its associated axioms and annotations from an ontology? For example, assume one had a list of classes from one ontology that they wanted to add to another ontology.
For example, consider the following class from IDO: http://purl.obolibrary.org/obo/IDO_0000406 (see below). In this example, it might be easier to simply parse the contents of the ontology's OWL file using a regex pattern such as /(<owl:Class .+?\s\s\s\s\s\n<\/owl:Class>)/ to capture all of the details of every class, after which the classes of interest could simply be filtered out and appended to a new OWL file.
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/IDO_...">
        ...
    </owl:Class>
    ...
    <owl:Class rdf:about="http://purl.obolibrary.org/obo/IDO_0000406">
        <rdfs:subClassOf rdf:resource="http://purl.obolibrary.org/obo/IDO_0000452"/>
        <rdfs:subClassOf>
            <owl:Class>
                <owl:intersectionOf rdf:parseType="Collection">
                    <owl:Restriction>
                        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/RO_0000052"/>
                        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/OBI_0100026"/>
                    </owl:Restriction>
                    <owl:Restriction>
                        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000054"/>
                        <owl:allValuesFrom>
                            <owl:Class>
                                <owl:intersectionOf rdf:parseType="Collection">
                                    <rdf:Description rdf:about="http://purl.obolibrary.org/obo/BFO_0000015"/>
                                    <owl:Restriction>
                                        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
                                        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/IDO_0000625"/>
                                    </owl:Restriction>
                                    <owl:Restriction>
                                        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
                                        <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/TRANS_0000000"/>
                                    </owl:Restriction>
                                    <owl:Restriction>
                                        <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
                                        <owl:someValuesFrom>
                                            <owl:Class>
                                                <owl:intersectionOf rdf:parseType="Collection">
                                                    <rdf:Description rdf:about="http://purl.obolibrary.org/obo/IDO_0000626"/>
                                                    <owl:Class>
                                                        <owl:complementOf>
                                                            <owl:Restriction>
                                                                <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000066"/>
                                                                <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/IDO_0000457"/>
                                                            </owl:Restriction>
                                                        </owl:complementOf>
                                                    </owl:Class>
                                                </owl:intersectionOf>
                                            </owl:Class>
                                        </owl:someValuesFrom>
                                    </owl:Restriction>
                                </owl:intersectionOf>
                            </owl:Class>
                        </owl:allValuesFrom>
                    </owl:Restriction>
                </owl:intersectionOf>
            </owl:Class>
        </rdfs:subClassOf>
        <obo:IAO_0000115 xml:lang="en">An infectious disposition to become part of a disorder only in organisms whose defenses are compromised.</obo:IAO_0000115>
        <obo:IAO_0000117>Albert Goldfain</obo:IAO_0000117>
        <obo:IAO_0000117>Alexander Diehl</obo:IAO_0000117>
        <obo:IAO_0000117>Lindsay Cowell</obo:IAO_0000117>
        <obo:IAO_0000118 xml:lang="en">opportunitistic pathogenic disposition</obo:IAO_0000118>
        <rdfs:comment>The disposition is realized in a process by which the bearer becomes part of a disorder in an immunocompromised host.</rdfs:comment>
        <rdfs:comment xml:lang="en">This includes individuals who are immunocompromised or who have damaged barriers that normally protect against infection (e.g. skin).</rdfs:comment>
        <rdfs:label xml:lang="en">opportunistic infectious disposition</rdfs:label>
    </owl:Class>
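To make the regex idea concrete, here is a rough Python sketch of how the text-based extraction might look. It rests on an assumption that may not hold for every file: that top-level class elements are indented by exactly four spaces and nested anonymous classes are indented more deeply, so the closing tag of the outer class can be told apart from the inner ones. The names CLASS_BLOCK and extract_class_blocks are made up for illustration.

```python
import re

# Assumption: top-level <owl:Class rdf:about="..."> elements start at exactly
# four spaces of indentation; nested anonymous <owl:Class> elements are deeper.
CLASS_BLOCK = re.compile(
    r'^ {4}<owl:Class rdf:about="([^"]+)">\n.*?^ {4}</owl:Class>',
    re.MULTILINE | re.DOTALL,
)


def extract_class_blocks(owl_text, wanted_iris):
    """Return {iri: raw XML block} for each wanted class found in owl_text."""
    wanted = set(wanted_iris)
    return {
        m.group(1): m.group(0)
        for m in CLASS_BLOCK.finditer(owl_text)
        if m.group(1) in wanted
    }
```

This would be fragile: it breaks as soon as the file is serialized with different indentation, and it ignores anything declared outside the class element itself, so it is probably only suitable as a quick one-off.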
EDIT: I tried using rdflib to achieve this by loading the source ontology as a Graph(). Using a for loop to iterate through the statements in the loaded graph, it's easy enough to find the triples that are directly connected to a class, e.g. ido:IDO_0000406 -> obo:IAO_0000117 -> Alexander Diehl. These can then be added to the target ontology, which is also loaded as a graph. However, using the IDO_0000406 class example, it is clear that triples below the first level (i.e. anything within the <rdfs:subClassOf><owl:Class>... clause) will not come through as expected, because their subjects are blank nodes rather than the class IRI. For example:
from rdflib import Graph

g_tgt = Graph()  # target graph that receives the extracted triples
classes = ['http://purl.obolibrary.org/obo/IDO_0000406',
           'http://purl.obolibrary.org/obo/IDO_0000407',
           'http://purl.obolibrary.org/obo/IDO_0000408',
           'http://purl.obolibrary.org/obo/IDO_0000409']
g_src = Graph()
g_src.parse('ido.owl')
for stmt in g_src:  # stmt: (subject, predicate, object)
    if str(stmt[0]) in classes:
        # adds all first-level triples to target graph
        g_tgt.add((stmt[0], stmt[1], stmt[2]))
My plan is to handle the nested (non-first-level) nodes recursively; I will update this post if that works.
EDIT 2: It should be possible to extract the classes using ROBOT (http://robot.obolibrary.org/extract). For example:
robot extract --method STAR \
    --input filtered.owl \
    --term-file uberon_module.txt \
    --output results/uberon_module.owl
where uberon_module.txt would include the list of classes to be extracted.