I'm trying to validate an XML file against an XSD file, but it doesn't work and I don't know why.

I found that this can be done from the terminal with xmllint:

xmllint --noout --schema owl2-xml.xsd camera.owl

But it produces an error that I don't understand:

regexp error : failed to compile: expecting a branch after |
owl2-xml.xsd:30: element pattern: Schemas parser error : Element '{http://www.w3.org/2001/XMLSchema}pattern': The value '([A-Z]|[a-z]|[À-Ö]|[Ø-ö]|[ø-˿]|[Ͱ-ͽ]|[Ϳ-῿]|[-]|[⁰-↏]|[Ⰰ-⿯]|[、-퟿]|[豈-﷏]|[ﷰ-�]|[-])(([A-Z]|[a-z]|[À-Ö]|[Ø-ö]|[ø-˿]|[Ͱ-ͽ]|[Ϳ-῿]|[-]|[⁰-↏]|[Ⰰ-⿯]|[、-퟿]|[豈-﷏]|[ﷰ-�]|[-]|_|\-|[0-9]|·|[̀-ͯ]|[‿-⁀]|\.)*([A-Z]|[a-z]|[À-Ö]|[Ø-ö]|[ø-˿]|[Ͱ-ͽ]|[Ϳ-῿]|[-]|[⁰-↏]|[Ⰰ-⿯]|[、-퟿]|[豈-﷏]|[ﷰ-�]|[-]|_|\-|[0-9]|·|[̀-ͯ]|[‿-⁀]  ))?|' of the facet 'pattern' is not a valid regular expression.
WXS schema owl2-xml.xsd failed to compile

But if I use an online validator (this one: http://mowl-power.cs.man.ac.uk:8080/validator/), my XML file is validated!

I don't understand why this isn't working, since the XML Schema I'm using should be the same one the validator uses.

The XML Schema comes from http://www.w3.org/2009/09/owl2-xml.xsd (OWL 2), and the validator also uses the OWL 2 structure. So what am I missing?

Example OWL file

This is the example I'm using and trying to validate: camera.owl

C. M. Sperberg-McQueen
Damiii

4 Answers

4

There are a number of ways that you can serialize an OWL ontology. One of them is to serialize it as RDF. RDF can also be serialized in a number of different formats, one of which is RDF/XML. Many of the files that you see online with a .owl extension are the RDF/XML serialization of the RDF representation of an OWL ontology. There's going to be lots of variation in the possibilities there, because the same RDF graph can be serialized in many different ways in the RDF/XML serialization. See my answer to How to access OWL documents using XPath in Java? for more about that issue.

Another way to serialize OWL ontologies is using the OWL/XML serialization, which is also XML based, but is not an RDF-based serialization. I'm assuming that you got the XSD file that you're using from 3.4 The XML Schema from OWL 2 Web Ontology Language XML Serialization (Second Edition). That serialization is a direct serialization of an OWL ontology in XML that doesn't take the OWL → RDF → RDF/XML route. That is, the XSD is for the OWL/XML format, not for RDF/XML.

So, I suspect that what's happening, regardless of whether or not your validator is handling the XSD correctly, is that you're attempting to validate an RDF/XML file using an XSD for OWL/XML. You didn't show any of the content of the OWL file that you're trying to validate though, so we can't be sure.

As a very simple example, here's a small OWL ontology in the OWL/XML serialization, generated through Protégé. This is what you get if you save the ontology using the OWL/XML format:

<?xml version="1.0"?>
<!DOCTYPE Ontology [
    <!ENTITY xsd "http://www.w3.org/2001/XMLSchema#" >
    <!ENTITY xml "http://www.w3.org/XML/1998/namespace" >
    <!ENTITY rdfs "http://www.w3.org/2000/01/rdf-schema#" >
    <!ENTITY rdf "http://www.w3.org/1999/02/22-rdf-syntax-ns#" >
]>
<Ontology xmlns="http://www.w3.org/2002/07/owl#"
     xml:base="https://stackoverflow.com/q/23984040/1281433/example"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:xml="http://www.w3.org/XML/1998/namespace"
     ontologyIRI="https://stackoverflow.com/q/23984040/1281433/example">
    <Prefix name="rdf" IRI="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
    <Prefix name="rdfs" IRI="http://www.w3.org/2000/01/rdf-schema#"/>
    <Prefix name="xsd" IRI="http://www.w3.org/2001/XMLSchema#"/>
    <Prefix name="owl" IRI="http://www.w3.org/2002/07/owl#"/>
    <Declaration>
        <Class IRI="#Person"/>
    </Declaration>
    <Declaration>
        <NamedIndividual IRI="#RichardNixon"/>
    </Declaration>
    <ClassAssertion>
        <Class IRI="#Person"/>
        <NamedIndividual IRI="#RichardNixon"/>
    </ClassAssertion>
    <AnnotationAssertion>
        <AnnotationProperty abbreviatedIRI="rdfs:label"/>
        <IRI>#RichardNixon</IRI>
        <Literal xml:lang="en" datatypeIRI="&rdf;PlainLiteral">Richard Nixon</Literal>
    </AnnotationAssertion>
</Ontology>
<!-- Generated by the OWL API (version 3.2.5.1912) http://owlapi.sourceforge.net -->

If you save the same ontology as RDF/XML, you get this:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns="https://stackoverflow.com/q/23984040/1281433/example#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#">
  <owl:Ontology rdf:about="https://stackoverflow.com/q/23984040/1281433/example"/>
  <owl:Class rdf:about="https://stackoverflow.com/q/23984040/1281433/example#Person"/>
  <owl:NamedIndividual rdf:about="https://stackoverflow.com/q/23984040/1281433/example#RichardNixon">
    <rdf:type rdf:resource="https://stackoverflow.com/q/23984040/1281433/example#Person"/>
    <rdfs:label xml:lang="en">Richard Nixon</rdfs:label>
  </owl:NamedIndividual>
</rdf:RDF>

They're both XML-based serializations of the ontology, but they're not the same, and only the OWL/XML representation would be validated by the XSD that you're using. Both could be validated using an OWL validator, though, because they're both legitimate serializations of an OWL ontology.
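As a rough illustration of the difference, the following Python sketch guesses which of the two serializations a document uses by looking only at its root element. The function name `detect_serialization` is made up for this example, and the check is a heuristic, not a validator: it assumes the RDF/XML file is wrapped in an rdf:RDF element, as in the example above, which is common but not strictly required.

```python
import xml.etree.ElementTree as ET

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def detect_serialization(xml_text):
    """Guess the serialization from the root element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # OWL/XML documents have an <Ontology> root in the OWL namespace.
    if root.tag == "{%s}Ontology" % OWL_NS:
        return "OWL/XML"
    # RDF/XML documents are usually wrapped in <rdf:RDF>.
    if root.tag == "{%s}RDF" % RDF_NS:
        return "RDF/XML"
    return "unknown"
```

Calling it on the text of a file that looks like the second example above should report "RDF/XML", which would mean the OWL/XML XSD is the wrong schema for that file.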

Joshua Taylor
  • And how could I validate an RDF/XML file? Thank you for all the answers you bring us – Damiii Jun 02 '14 at 21:09
  • The example I'm trying to validate is the one I just added to my post! – Damiii Jun 02 '14 at 21:11
  • @Damiii Yes, and your camera.owl is an RDF/XML serialization of the RDF representation of the OWL ontology. That means that the OWL 2 XSD you have does not apply. What is it that you want to validate about it? That it's legal XML? That it's legal RDF/XML? That it's a legal (RDF/XML serialization of an RDF representation of an) OWL ontology? Why not use an OWL validator like the one you mentioned in the question? – Joshua Taylor Jun 02 '14 at 21:19
  • I want to check whether my OWL file is syntactically correct. I'm doing a project that parses an OWL file, and I would like to see whether my OWL file is correct or not (at least, I want to do what the validator link does). But I'm doing my project in Haskell, so how would I be able to do that in Haskell? – Damiii Jun 02 '14 at 22:16
  • I think that what doesn't seem to be coming across is that there are several syntaxes here that you're dealing with. If you want to use the XSD that you mentioned, then you need to write your ontology in a different syntax. – Joshua Taylor Jun 02 '14 at 22:25
  • And what if I wanted to not use my XSD and use something like the validator link instead? – Damiii Jun 02 '14 at 22:37
  • If you want to use the validator link, then you can continue to use your current syntax (RDF/XML) and continue using the validator link. – Joshua Taylor Jun 02 '14 at 23:19
  • And how would I be able to do that from a program? – Damiii Jun 02 '14 at 23:49
  • Besides, I was looking at the XML Schema again, and the tags are almost like those in RDF/XML files; that's why I don't understand why I can't use it for my camera.owl – Damiii Jun 02 '14 at 23:50
  • Yup, there are some similar names there, but it's not the same syntax. Both are XML-based syntaxes, but they're not the same. That's all there is to it, I'm afraid. As to using the validator from a program, it's just a matter of calling that web service with the appropriate content. That's really a different question, though. – Joshua Taylor Jun 03 '14 at 00:04
  • And this is my only option? :( – Damiii Jun 03 '14 at 10:19
  • The issue here is that at the moment you've got **four** different things to check: (i) whether your OWL ontology, as an OWL ontology, is a legal OWL ontology; (ii) whether the RDF translation of your OWL ontology is a legal RDF document; (iii) whether the RDF translation of your OWL ontology, assuming that it's legal RDF, is a correct translation of your OWL ontology; and (iv) whether the RDF/XML serialization of your RDF document is a correct serialization of your RDF document. The best option, in my opinion, is to use an ontology format that eliminates the need for (ii), (iii), and (iv). – Joshua Taylor Jun 03 '14 at 11:42
  • Using a format like OWL/XML that skips the RDF translation and the RDF/XML serialization would seem to be the easiest way to do that. The XSD will work with the OWL/XML serialization, so you can use the XSD to check (i). – Joshua Taylor Jun 03 '14 at 11:42
  • Meh, then I won't do anything... Ontologies are too liberal and not restrictive, which doesn't let us do anything at this point. Thank you for all the replies. – Damiii Jun 03 '14 at 14:02
  • What do you mean that it's too liberal and not restrictive? The specification of an OWL ontology is very definite. There are just lots of ways to *serialize* them. You just need to pick a serialization to use, then read your serialization and validate the ontology as an ontology, and not worry so much about its serialization. – Joshua Taylor Jun 03 '14 at 14:05
  • Basically, it is as you said: "lots of ways to serialize". – Damiii Jun 03 '14 at 14:15
  • And where could I get a serialization for camera.owl? I mean, if I must do it myself, I don't have time to do it... That's why I was expecting an XML Schema or something else that would validate my camera.owl syntactically. – Damiii Jun 03 '14 at 14:16
  • Again, "validate my camera.owl syntactically" isn't specific enough; there are at least two different syntaxes here. It's like taking a recipe, then writing it in German, then translating it to English. You can ask "is the English recipe correct?" There are a few possibilities: "Is the English a correct translation of the German?" "Is the German a correct record of the recipe?" and "Is the recipe correct itself?" The ontology is the recipe. The German is the RDF translation of the OWL. The English is the RDF/XML serialization of the RDF. You can check whether camera.owl is legal RDF/XML – Joshua Taylor Jun 03 '14 at 14:28
  • with some RDF/XML validator (maybe there's an XSD out there, I don't know). You can check whether it's a legal OWL ontology with the online OWL validator. You could use the OWLAPI to read camera.owl and then write it out as OWL/XML and then use the OWL/XML XSD to check that. – Joshua Taylor Jun 03 '14 at 14:29
2

The validator on mowl-power validates a file as an OWL 2 ontology, not as XML. DTD and XSD resolution is usually switched off for the OWLAPI parsers it uses, I believe.

Ignazio
  • Hmm, what do you mean by "validates a file as an OWL 2 ontology, not as XML"? I'm a little confused. – Damiii Jun 02 '14 at 08:39
  • I mean, if I want to check whether my XML file is well structured, how am I going to do that without the XSD file? – Damiii Jun 02 '14 at 08:40
  • The OWL 2 validation rules are not XML-level rules; they apply to ontologies written in XML or otherwise. Your approach to validating the XML is correct, but it's a different validation from the one performed by the site you linked. – Ignazio Jun 02 '14 at 12:56
  • @Joshua's answer is very complete in describing the situation – Ignazio Jun 02 '14 at 18:09
1

xmllint's regular-expression parser appears to be in error. As the error message makes clear, it's expecting that the branch separator | will be followed by some non-empty branch; the XSD spec, however, is clear that the empty string counts as a branch. If you want to validate your XML against this XSD schema, you will need to use a validator with a more reliable implementation of XSD.
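To make the "empty branch" point concrete, here is a small sketch using Python's re module. Python's regex dialect is not the XSD one, so this only illustrates the idea; it happens to accept a trailing | as an empty alternative in the same way the XSD grammar does. The pattern is an ASCII-only simplification of the shape of the pattern quoted in the error message, not the real one from owl2-xml.xsd, and `matches` is a name invented for this example.

```python
import re

# Simplified, ASCII-only stand-in for the pattern in the error message:
# a name-like branch, then "|" followed by an empty branch.
PATTERN = r"([A-Z]|[a-z])(([A-Z]|[a-z]|_|\-|[0-9]|\.)*([A-Z]|[a-z]|_|\-|[0-9]))?|"


def matches(value):
    """Return True if the whole value matches, as XSD patterns are implicitly anchored."""
    return re.fullmatch(PATTERN, value) is not None
```

With this pattern, the empty string is accepted only because of the empty branch after the final |; the first branch always requires at least one letter.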

C. M. Sperberg-McQueen
-1

You can use Owlready (http://pythonhosted.org/Owlready/) to read and parse the OWL file before your own code processes it.