I've been validating XML files against their associated schemas with lxml, but I found that it does not (or appears not to) catch the case where an IDREF-typed attribute has no matching ID-typed attribute value anywhere in the document.
For example, the attribute "internalRefId" is of type "IDREF" and the attribute "id" is of type "ID". In the schema:

<xs:attribute name="internalRefId" type="xs:IDREF"/>
<xs:attribute name="id" type="xs:ID"/>

When used in an XML file:

....See <internalRef internalRefId="T0003" internalRefTargetType="irtt02"/>....

There should be a corresponding element with the attribute id="T0003", such as:

<table id="T0003">

It is a problem if there is no id="T0003" anywhere in the XML document. As I understand it, detecting exactly this is the purpose of declaring attributes as IDREF/ID types, so an XML validator should catch it.
But I can't figure out how to get the lxml validator to do this. If I remove, for example, the table element above, lxml validation does not catch it and still considers the document valid. My validation code is similar to the following:

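from lxml import etree as ET

# filePath is the XML document; somePathtoSchemaXsdFile is its XSD schema
tree = ET.parse(filePath)
schema = ET.XMLSchema(ET.parse(somePathtoSchemaXsdFile))

# True if the document validates against the schema, False otherwise
isValid = schema.validate(tree)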
  1. If lxml does verify IDREF/ID matches, what do I need to do to enable that check?
  2. If lxml does not handle this, it would be nice to know why, but more importantly, what options do I have? The environment is primarily Python-based, and a free solution is preferred, but other solutions are on the table.
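For reference, the fallback I can think of is to do the check by hand after schema validation. This is only a minimal sketch and is not schema-aware: it assumes the ID and IDREF attributes are literally named "id" and "internalRefId" as in my example, the helper name find_dangling_idrefs is made up for illustration, and it uses the standard library's xml.etree.ElementTree so that it runs without lxml (lxml.etree offers the same iter() and get() calls).

```python
import xml.etree.ElementTree as ET


def find_dangling_idrefs(root, id_attr="id", idref_attr="internalRefId"):
    """Return IDREF values that have no matching ID value in the document.

    The attribute names are assumptions taken from the example above;
    nothing here reads the schema to discover which attributes are ID/IDREF.
    """
    # Collect every declared ID value in the document.
    ids = {el.get(id_attr) for el in root.iter() if el.get(id_attr) is not None}
    # Collect every reference, in document order.
    refs = [el.get(idref_attr) for el in root.iter() if el.get(idref_attr) is not None]
    # A reference is dangling when no element carries that ID.
    return [ref for ref in refs if ref not in ids]


# Hypothetical sample document: T0003 has a target, T0004 does not.
sample = """<doc>
  <para>See <internalRef internalRefId="T0003" internalRefTargetType="irtt02"/></para>
  <para>See <internalRef internalRefId="T0004" internalRefTargetType="irtt02"/></para>
  <table id="T0003"/>
</doc>"""

dangling = find_dangling_idrefs(ET.fromstring(sample))
print(dangling)  # ['T0004']
```

This would flag the missing table, but it hard-codes the attribute names, so I would much rather have the validator do it based on the schema.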
mzjn
beakerchi