I am trying to format the following XML

<block formula="MY_VAR < 3"><set-variable name="OTHER_VAR"></set-variable></block>

into

<block formula="MY_VAR < 3">
  <set-variable name="OTHER_VAR">
  </set-variable>
</block>

using xmlformatter, but I get an error because of the < in my formula. Specifically, the error is

ExpatError: not well-formed (invalid token)

when I try the code

import xmlformatter

my_xml = '<block formula="MY_VAR < 3"><set-variable name="OTHER_VAR"></set-variable></block>'
formatter = xmlformatter.Formatter(indent="1", indent_char="  ", encoding_output="UTF-8", preserve=["literal"])
pretty_xml = formatter.format_string(my_xml)

How do I include the less than in my formula and be able to format my XML?

wogsland
  • Similar https://stackoverflow.com/questions/29398950/is-there-a-way-to-include-greater-than-or-less-than-signs-in-an-xml-file – snakecharmerb Mar 20 '19 at 10:26

1 Answer

You can use xml.sax.saxutils.quoteattr to escape the attribute value when constructing the XML string. It replaces < with &lt; and wraps the result in quotes.

>>> from xml.sax import saxutils as su
>>> my_xml = '<block formula=%s><set-variable name="OTHER_VAR"></set-variable></block>' % su.quoteattr('MY_VAR < 3')
>>> my_xml
'<block formula="MY_VAR &lt; 3"><set-variable name="OTHER_VAR"></set-variable></block>'
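
As a rough check of why the escaping matters, the sketch below parses both versions with the standard library's xml.dom.minidom, which is Expat-based and so fails with the same error class as in the question. The names template, raw_parses and pretty_xml are illustrative only, and minidom's toprettyxml is used here as a stand-in for xmlformatter, not as a claim about how xmlformatter behaves.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import quoteattr

template = '<block formula=%s><set-variable name="OTHER_VAR"></set-variable></block>'

# Unescaped: the raw < inside the attribute value is not well-formed XML.
raw_xml = template % '"MY_VAR < 3"'
try:
    minidom.parseString(raw_xml)
    raw_parses = True
except ExpatError:
    raw_parses = False

# Escaped: quoteattr turns < into &lt; and adds the surrounding quotes.
escaped_xml = template % quoteattr('MY_VAR < 3')
doc = minidom.parseString(escaped_xml)

# The parser decodes the entity, so the application still sees the original text.
formula = doc.documentElement.getAttribute('formula')

# Pretty-printing now works; the serializer re-escapes the attribute on output.
pretty_xml = doc.documentElement.toprettyxml(indent='  ')
print(pretty_xml)
```

The formula read back from the parsed document is the original MY_VAR < 3, so escaping changes only the serialized form, not the value.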

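If you don't control construction of the XML, this hack will fix the XML in the example. It assumes markup brackets alternate between < and >, and escapes any bracket that repeats the previous one:

bad_xml = '<block formula="MY_VAR < 3"><set-variable name="OTHER_VAR"></set-variable></block>'

stack = []  # brackets kept as markup so far
out = []
brackets = '<>'

for c in bad_xml:
    if c in brackets:
        try:
            prev = stack[-1]
        except IndexError:
            # First bracket in the document: always markup.
            stack.append(c)
            out.append(c)
        else:
            if prev == c:
                # Same bracket twice in a row: treat it as data and escape it.
                escaped = '&gt;' if c == '>' else '&lt;'
                out.append(escaped)
            else:
                stack.append(c)
                out.append(c)
    else:
        out.append(c)
my_xml = ''.join(out)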
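
A different repair, sketched here as an alternative rather than as part of the hack above, is to escape brackets only inside double-quoted spans with a regular expression. The helper name escape_brackets_in_attributes is hypothetical, and the approach assumes every double-quoted span in the input is an attribute value (no quotes in text content, no single-quoted attributes). It deliberately leaves & alone so existing entities are not double-escaped.

```python
import re
from xml.dom import minidom

bad_xml = '<block formula="MY_VAR < 3"><set-variable name="OTHER_VAR"></set-variable></block>'


def escape_brackets_in_attributes(xml_text):
    """Hypothetical helper: escape < and > inside double-quoted spans only."""
    def fix(match):
        # match.group(0) is one quoted span, including its quotes.
        return match.group(0).replace('<', '&lt;').replace('>', '&gt;')

    return re.sub(r'"[^"]*"', fix, xml_text)


fixed_xml = escape_brackets_in_attributes(bad_xml)

# Sanity check: the repaired string parses and the attribute round-trips.
formula = minidom.parseString(fixed_xml).documentElement.getAttribute('formula')
print(fixed_xml)
```

Like the loop above, this is a heuristic for malformed input and will misfire on documents that break its assumptions; escaping at construction time remains the more reliable fix.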
snakecharmerb