3

How can I validate an XML file (using libxml) without specifying the schema file explicitly? The XSD file is referenced in the header of the XML file. The corresponding XSD URL should be resolved to a file on the local file system using a catalog.xml.

chandras
  • See this question and answer for the validation, https://stackoverflow.com/questions/17819884/xml-xsd-feed-validation-against-a-schema/17819981#17819981 and this one for setting up a catalog for Linux (works the same way for xmllint and lxml, set an environment variable with the catalog location): https://stackoverflow.com/questions/11623369/how-to-set-up-catalog-files-for-xmllint – Thomas BDX Sep 20 '18 at 15:26

2 Answers

1

It looks like this is not possible at the moment (libxml2 2.8.0). This is taken from the libxml2 documentation page (xmlschemas):

interface to the XML Schemas handling and schema validity checking, it is incomplete right now.

As a workaround, one may use a combined schema that contains an import element for each namespace. It may list namespaces that a given document does not use. The combined schema must then be passed to the validator explicitly.

Namespaces imported with xsd:import are resolved correctly using catalogs, unless the schemaLocation attribute of the import specifies a valid direct location.

<import namespace="http://example.com"
        schemaLocation="example.xsd"/>

If example.xsd does not exist in the current directory, it is resolved using the catalog files.
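The combined-schema workaround described above could be generated rather than written by hand. The following is only a minimal sketch using the Python standard library: the function name `build_combined_schema` and the second namespace are assumptions made for illustration, and the sketch only produces the schema text. It does not perform validation or catalog resolution itself.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"


def build_combined_schema(imports):
    """Return XSD text that imports every (namespace, schemaLocation) pair."""
    # Use the conventional "xs" prefix in the serialized output.
    ET.register_namespace("xs", XSD_NS)
    root = ET.Element(f"{{{XSD_NS}}}schema")
    for namespace, location in imports:
        ET.SubElement(
            root,
            f"{{{XSD_NS}}}import",
            {"namespace": namespace, "schemaLocation": location},
        )
    return ET.tostring(root, encoding="unicode")


combined = build_combined_schema([
    ("http://example.com", "example.xsd"),
    ("http://example.org/other", "other.xsd"),  # hypothetical second namespace
])
print(combined)
```

The resulting text is what would then be passed to the validator explicitly, as described above.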

Jarekczek
1

I know it's an old question; however, it's 2021 and some governments just woke up to the whole internet thing. Long story short, they use XML (yeah, I know).

Schema validation there relies on XSD files resolved through a catalog, and lxml didn't use the catalog, at least with Python 3.9 on Windows 10 in 2021. Instead, I found out that the imports inside the schema files can be rewritten on the fly before loading.

So what I did to fix it:

from lxml import etree

# Parse the top-level schema, then point every xs:import at a local file.
xmlschemadoc = etree.parse(xsd_file_with_imports)
for i in xmlschemadoc.findall(".//{http://www.w3.org/2001/XMLSchema}import"):
    i.attrib['schemaLocation'] = convert_namespace_to_xsd_file(i.attrib['namespace'])

Then you can use the schema:

# `xml` is the already parsed document to check, e.g. etree.parse(xml_file).
xmlschema = etree.XMLSchema(xmlschemadoc)
xmlschema.assertValid(xml)
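The answer does not show `convert_namespace_to_xsd_file`. One possible way to write such a helper is to read the mapping from an OASIS XML catalog. The sketch below uses only the standard library and makes several assumptions: the catalog keys its `<uri>` entries by namespace, the catalog content and paths are made up, and `load_catalog` is a hypothetical name. A real catalog would be read from catalog.xml rather than from an inline string.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog content; a real one would be read from catalog.xml.
CATALOG_XML = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="http://example.com" uri="schemas/example.xsd"/>
  <uri name="http://example.org/other" uri="schemas/other.xsd"/>
</catalog>
"""


def load_catalog(catalog_text):
    """Map the name of each <uri> entry to its local uri attribute."""
    root = ET.fromstring(catalog_text)
    return {
        entry.get("name"): entry.get("uri")
        for entry in root.iter(f"{{{CATALOG_NS}}}uri")
    }


CATALOG = load_catalog(CATALOG_XML)


def convert_namespace_to_xsd_file(namespace):
    """Return the local XSD path for a namespace, or raise if it is unmapped."""
    try:
        return CATALOG[namespace]
    except KeyError:
        raise LookupError(f"no catalog entry for namespace {namespace!r}") from None


print(convert_namespace_to_xsd_file("http://example.com"))
```

Raising on an unmapped namespace, instead of returning the namespace unchanged, keeps a missing catalog entry from silently turning into a network fetch or a confusing schema error later on.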
Ajay