5

I am using Python xml.etree.ElementTree to output XML. I want to output it with entity references that will be substituted when the XML is parsed.

ordinarily '&' is escaped as & because '&' is used to declare entity references. However, I really do want to write an entity reference. For example, I want to write an XML file containing the entity reference &manifestName;:

>>> from xml.etree.ElementTree import Element, tostring
>>> manifest = Element('manifest')
>>> manifest.text = '&manifestName;'
>>> tostring(manifest)

Which returns an escaped ampersand:

'<manifest>&amp;manifestName;</manifest>'

The desired XML would be:

'<manifest>&manifestName;</manifest>'

I have tried various escaping tricks, like &#38;, \&, &&, but they do not work. The ampersands they contain are always rendered as &amp;.

  • See http://stackoverflow.com/questions/2360598/how-do-i-unescape-html-entities-in-a-string-in-python-3-1 –  Nov 03 '11 at 03:44

1 Answers1

1

I have decided to go with a relatively palatable hack. In the text, I use && to mean an escaped &. ElementTree converts this to &amp;&amp;. At the end, I simply do a string replacement on it:

>>> from xml.etree.ElementTree import Element, tostring
>>> manifest = Element('manifest')
>>> manifest.text = '&&manifestName;'
>>> tostring(manifest).replace('&amp;&amp;', '&')

The result is the entity reference I want:

'<manifest>&manifestName;</manifest>'
  • I am having trouble implementing your solution. Could you check this question? https://stackoverflow.com/questions/56892938/xml-line-break-character-entity-and-python-encoding – posfan12 Jul 06 '19 at 23:01