0

I wrote a Python script allowing to parse a csv file into a xml file following a xsd schema file. The output xml file gets validated against the xsd schema using lxml.etree's XMLSchema method.

I am currently handling missing values by replacing them with a dummy number (-999), as detailed in this question, but I was recently provided with a new xsd file allowing nillable elements:

<xs:element name="ProfileValue" type="xs:double" nillable="true"/>

1. Following this SO post, I tried using xml_data.append('<ProfileValue xsi:nil="true"/>') to pass on "missing" values to the xml file, but this leads to the following error:

Namespace prefix xsi for nil on ProfileValue is not defined

2. After reading that xsi and xs prefixes were interchangeable, I also tried xml_data.append('<ProfileValue xs:nil="true"/>'), but sure enough, this also returns a XMLSyntaxError:

Namespace prefix xs for nil on ProfileValue is not defined

3. I tried omitting the prefix (xml_data.append('<ProfileValue nil="true"/>')) , but this then leads to this error:

Element 'ProfileValue': '' is not a valid value of the atomic type 'xs:double'.

4. I aslo tried xml_data.append('<ProfileValue nil="true"> </ProfileValue>') following this SO post, but also got an error:

Element 'ProfileValue': ' ' is not a valid value of the atomic type 'xs:double'.

What am I doing wrong here? Is it even possible to set double type values as nillable?

Sheldon
  • 4,084
  • 3
  • 20
  • 41

1 Answers1

2

Thanks for a great problem description, and thanks for explaining all of the things that you already tried.

I think you have two problems

  1. You are using the namespace prefix 'xsi' but you are not declaring that namespace. That's like using a variable in C/Java without declaring it first. You need to include an xmlns attribute (or 'namespace declaration') in your output XML. Like this: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance". It can be included on every tag that uses xsi:nil but it is often declared once on the root tag, thus avoiding multiple declarations of the xsi namespace.

  2. You are constructing your output XML using string manipulation. That can work, but only if you stay lucky. Since you obviously have access to the lxml library, I strongly suggest that you allow lxml to do all of the XML syntax for you (see LXML namespaces). If you continue on your current path you will need to deal with XML escaping rules for reserved XML characters, illegal XML characters etc.

kimbert
  • 2,376
  • 1
  • 10
  • 20
  • Good answer, +1 from my side! – Yitzhak Khabinsky Jul 11 '21 at 01:47
  • Thanks for your detailed answer, kimbert. 1. Declaring the xsi namespace in the root did the trick for me. 2. Agreed.After reading this post (https://stackoverflow.com/questions/3034611/whats-so-bad-about-building-xml-with-string-concatenation), I am now convinced that I should no use string concatenation to build xml content. Hopefully, the LXML library will help me build an alternate solution. – Sheldon Jul 11 '21 at 08:18