python: convert to HTML special characters

Question

Possible Duplicate:
Replace html entities with the corresponding utf-8 characters in Python 2.6
What's the easiest way to escape HTML in Python?

Is there a way to easily convert a string to an HTML string, e.g. with characters like <, > replaced by &lt;, &gt;, or will I have to write my own conversion routine?

see http://docs.python.org/library/htmllib.html#module-htmlentitydefs — Ashwini Chaudhary, Jun 12 '12 at 09:22
I think what you need is called "HTML escaping". This is why you didn't find the answer by yourself. [Here is a Stackoverflow answer.](http://stackoverflow.com/questions/1061697/whats-the-easiest-way-to-escape-html-in-python) — anonymous, Jun 12 '12 at 09:24

score 12 · Accepted Answer · answered Jun 12 '12 at 09:23

If you're only concerned about critical special characters like &, < and >:

>>> import cgi
>>> cgi.escape("<hello&goodbye>")
'&lt;hello&amp;goodbye&gt;'

For other non-ASCII characters:

>>> "Übeltäter".encode("ascii", "xmlcharrefreplace")
b'&#220;belt&#228;ter'

Of course, if necessary, you can combine the two:

>>> cgi.escape("<Übeltäter>").encode("ascii", "xmlcharrefreplace")
b'&lt;&#220;belt&#228;ter&gt;'
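
On current Python 3 the same idea can be sketched with `html.escape()`, since `cgi.escape()` was deprecated and later removed. The helper name `to_html_ascii` below is made up for illustration and is not part of any library. Note that `html.escape()` also escapes quotes by default, which `cgi.escape()` did not.

```python
import html


def to_html_ascii(text: str) -> str:
    """Escape HTML metacharacters, then replace non-ASCII characters
    with numeric character references (hypothetical helper)."""
    # html.escape handles &, < and >, plus " and ' (quote=True is the default).
    escaped = html.escape(text)
    # xmlcharrefreplace turns each non-ASCII character into &#NNN;.
    # Decoding back gives a str instead of bytes.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_html_ascii("<Übeltäter>"))  # &lt;&#220;belt&#228;ter&gt;
```

If the page is served as UTF-8, the second step is usually unnecessary and `html.escape()` alone should be enough.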

answered Jun 12 '12 at 09:23

Tim Pietzcker


`>>> "Übeltäter".encode("ascii", "xmlcharrefreplace")` results in `UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)` – brandones Jun 01 '17 at 19:49
`cgi.escape()` is now deprecated. Use `html.escape()` instead - check [this answer](https://stackoverflow.com/a/5072031/738017) – Vito Gentile Sep 27 '21 at 14:38
