I'm generating some HTML using ElementTree in a Python script. The relevant part looks like this:

separator = ET.SubElement(toc, 'span')
separator.set('class', 'separator')
separator.text = u"&nbsp;⬩"

The problem is that the HTML entity `&nbsp;` is being encoded in the resulting XML output as `&amp;nbsp;`, which puts the literal text `&nbsp;` on the rendered web page instead of inserting a non-breaking space.
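
A minimal reproduction of the behaviour might look like the sketch below. The `div` parent is only a stand-in, since the real `toc` element is not shown here, and `markup` is a name chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical parent element; the real `toc` comes from elsewhere in the script.
toc = ET.Element('div')
separator = ET.SubElement(toc, 'span')
separator.set('class', 'separator')
separator.text = "&nbsp;⬩"

# ElementTree treats .text as plain character data, so the ampersand
# is escaped on output and the entity reference is lost.
markup = ET.tostring(toc, encoding='unicode', method='html')
print(markup)
```

The serialized span contains `&amp;nbsp;`, which a browser renders as the visible text `&nbsp;`.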

I'm working around it by escaping a non-breaking space into the Python string like this: u"\u00A0⬩". This puts a literal non-breaking space in my HTML instead of the entity. Ultimately, it works because it is faithfully rendered as a non-breaking space, but it makes the source code hard to read because a non-breaking space looks like a regular space.
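
One possible way to keep this workaround readable, assuming a numeric character reference is an acceptable substitute for the named entity, is to give the character a name and serialize to ASCII bytes. `NBSP` is a hypothetical constant and the `div` parent is again a stand-in for the real `toc`. When the output encoding cannot represent a character, ElementTree writes it as a numeric character reference instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical named constant so the non-breaking space is visible in the source.
NBSP = "\u00A0"

# Hypothetical parent element standing in for the real `toc`.
toc = ET.Element('div')
separator = ET.SubElement(toc, 'span')
separator.set('class', 'separator')
separator.text = NBSP + "⬩"

# With an ASCII output encoding, non-ASCII characters are emitted as
# numeric character references rather than raw characters.
ascii_markup = ET.tostring(toc, encoding='us-ascii', method='html')
print(ascii_markup.decode('ascii'))
```

The non-breaking space then shows up in the generated HTML as `&#160;`, which browsers treat the same as `&nbsp;`. Note that the `⬩` character is also written as a numeric reference under this encoding.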

How can I get ElementTree to insert an HTML entity into my source code?

Zev Eisenberg
  • XML doesn't support entities except for quote, apostrophe, ampersand, and angle brackets. The rest are an HTML extension. You may need to find another tool. – Tim Roberts May 15 '21 at 03:54
  • Not sure what can be done about this. All `&` characters are replaced by `&amp;` during serialization (https://github.com/python/cpython/blob/main/Lib/xml/etree/ElementTree.py#L2040). See also https://stackoverflow.com/q/9276848/407651. lxml behaves in a similar way: https://stackoverflow.com/q/67299346/407651. – mzjn May 15 '21 at 07:40
  • Another similar question: https://stackoverflow.com/q/7986272/407651 – mzjn May 15 '21 at 09:45

0 Answers