20

Is there a similar or equivalent function in Python to the PHP function htmlspecialchars()? The closest thing I've found so far is htmlentitydefs.entitydefs().

animuson
Ian

8 Answers

12

The closest thing I know of is cgi.escape, though it was deprecated in Python 3.2 and removed in Python 3.8.
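
On Python 3, html.escape from the standard library covers the same ground. Here is a minimal sketch; the sample string is made up for illustration. Note that html.escape writes a single quote as &#x27; rather than the &#039; that PHP emits.

```python
import html

sample = '<div class="q">Q & A</div>'  # made-up input for illustration

# quote=True (the default) escapes & < > and both quote characters
escaped = html.escape(sample)
print(escaped)

# quote=False leaves " and ' untouched, roughly like PHP's ENT_NOQUOTES
escaped_no_quotes = html.escape(sample, quote=False)
print(escaped_no_quotes)
```

If the output has to match PHP byte for byte, the single-quote form is the one difference to watch for.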

Karl Guertin
7
from django.utils.html import escape
print(escape('<div class="q">Q & A</div>'))
Agonych
    I'm voting for this because I don't want to parse anything like some of the other answers, or even do a search and replace, I want a single function that does it all for me. – paulmorriss Jun 18 '10 at 15:36
5

Building on @garlon4's answer, you can define your own htmlspecialchars(text):

def htmlspecialchars(text):
    # Escape "&" first so the ampersands in the other entities are left alone.
    return (
        text.replace("&", "&amp;")
        .replace('"', "&quot;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )
AlejandroVD
  • I think Python has a fancy string method named something like "translate" that you could use to make this even shorter. – Brian Peterson Feb 07 '20 at 06:25
  • Too lazy right now but yeah: https://www.programiz.com/python-programming/methods/string/translate – Brian Peterson Feb 07 '20 at 06:27
  • Helpful answer, however you're passing the parameters to replace() in the wrong order. Should be: replace("string to find", "string to replace") – Ben Apr 25 '21 at 08:06
  • @Ben no, the function works as expected (it escapes the "html special chars"). It looks for the char to escape, and replaces it by the html escape sequence for that char. Maybe you wanted to un-escape instead? – AlejandroVD Apr 26 '21 at 20:01
  • My mistake! @AlejandroVD you are spot on. – Ben Aug 15 '21 at 15:31
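
The translate idea raised in the comments could look roughly like this. It is only a sketch: the names _HTML_ESCAPES and htmlspecialchars_translate are hypothetical, and it escapes the same four characters as the answer above.

```python
# Map each special character to its HTML escape sequence.
_HTML_ESCAPES = str.maketrans({
    "&": "&amp;",
    '"': "&quot;",
    "<": "&lt;",
    ">": "&gt;",
})


def htmlspecialchars_translate(text):
    # translate() walks the string once, so the "&" inside the
    # replacement strings is never escaped a second time.
    return text.translate(_HTML_ESCAPES)


print(htmlspecialchars_translate('<a href="x">Q & A</a>'))
```

Because the mapping is applied in a single pass, the order of the entries does not matter, unlike with chained replace() calls.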
3

I think the simplest way is just to use replace:

text.replace("&", "&amp;").replace('"', "&quot;").replace("<", "&lt;").replace(">", "&gt;")

With ENT_COMPAT, PHP's htmlspecialchars escapes only those four characters. If you have ENT_QUOTES set (the default since PHP 8.1), single quotes are escaped too, so also replace ' with &#039;; double quotes still become &quot;.
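
To mirror PHP's quote flags, something along these lines might work. It is a sketch, not a drop-in port: the quotes parameter and its three values are made up for this example, and it assumes ENT_QUOTES escapes single quotes as &#039; in addition to double quotes.

```python
def htmlspecialchars(text, quotes="compat"):
    """Hypothetical helper; quotes is "compat", "quotes" or "noquotes"."""
    if quotes not in ("compat", "quotes", "noquotes"):
        raise ValueError("unknown quotes mode: %r" % (quotes,))
    # "&" goes first so the entities added below are not escaped again.
    text = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    if quotes in ("compat", "quotes"):
        text = text.replace('"', "&quot;")   # like ENT_COMPAT and ENT_QUOTES
    if quotes == "quotes":
        text = text.replace("'", "&#039;")   # like ENT_QUOTES only
    return text


print(htmlspecialchars("""<p title="it's">Q & A</p>""", quotes="quotes"))
```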

garlon4
3

You probably want xml.sax.saxutils.escape:

from xml.sax.saxutils import escape
escape(unsafe, {'"': '&quot;'})                # ENT_COMPAT
escape(unsafe, {'"': '&quot;', "'": '&#039;'}) # ENT_QUOTES
escape(unsafe)                                 # ENT_NOQUOTES

Also have a look at xml.sax.saxutils.quoteattr; it might be more useful if you are escaping attribute values.
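
As a rough illustration of the difference between the two functions, with made-up sample values: escape() returns the escaped text on its own, while quoteattr() also wraps the result in quotes so it can be dropped straight into an attribute.

```python
from xml.sax.saxutils import escape, quoteattr

unsafe = 'Q & A <"quoted">'  # made-up sample input

# escape() handles & < > itself; the dict adds extra replacements.
text_safe = escape(unsafe, {'"': "&quot;"})
print(text_safe)

# quoteattr() escapes the value and adds the surrounding quotes.
attr = quoteattr("Q & A")
tag = "<input value=%s>" % attr
print(tag)
```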

Nicolas Dumazet
1

Only five characters need to be escaped, so you can use a simple one-line function:

def htmlspecialchars(content):
    return content.replace("&", "&amp;").replace('"', "&quot;").replace("'", "&#039;").replace("<", "&lt;").replace(">", "&gt;")
Pikamander2
1

The html.entities module (htmlentitydefs in Python 2.x) contains a dictionary, codepoint2name, that maps code points to HTML entity names, which should do what you need.

>>> import html.entities
>>> html.entities.codepoint2name[ord("&")]
'amp'
>>> html.entities.codepoint2name[ord('"')]
'quot'
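
One way to turn that lookup into an escaping function is sketched below. The name escape_with_entities and its chars parameter are hypothetical, and characters with no entry in codepoint2name are passed through unchanged.

```python
from html.entities import codepoint2name


def escape_with_entities(text, chars='&<>"'):
    # Replace each listed character with its named entity, e.g. "&" -> "&amp;".
    # Characters missing from codepoint2name are left as they are.
    return "".join(
        "&%s;" % codepoint2name[ord(ch)]
        if ch in chars and ord(ch) in codepoint2name
        else ch
        for ch in text
    )


print(escape_with_entities('<b>Q & "A"</b>'))
```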
sykora
-1

If you are using Django 1.0 or later, your template variables are automatically escaped and ready for display. You can also use the safe filter, {{ var|safe }}, to exempt a single variable from escaping.

Paul Tarjan