7

I have an <img src=__string__>, but the string might contain a double quote ("). How should I escape it?

Example:

__string__ = test".jpg
<img src="test".jpg">

doesn't work.

SilentGhost
Timmy
  • This question http://stackoverflow.com/questions/275174/how-do-i-perform-html-decoding-encoding-using-python-django has some useful answers. – hwiechers Jun 22 '10 at 20:56

5 Answers

15

Python 3.2 introduced a new html module for escaping reserved characters in HTML markup.

Its main function is html.escape(s, quote=True). If the optional flag quote is true (the default), the characters (") and (') are also translated.

Usage:

>>> import html
>>> html.escape('x > 2 && x < 7')
'x &gt; 2 &amp;&amp; x &lt; 7'
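Applied to the case in the question, a minimal sketch might look like the following. The helper name img_tag is hypothetical and only for illustration; the only library call used is html.escape.

```python
import html

def img_tag(src):
    # img_tag is a hypothetical helper, not part of the html module.
    # quote=True (the default) also escapes " and ', so the value
    # cannot terminate a quoted attribute early.
    return '<img src="{}">'.format(html.escape(src, quote=True))

print(img_tag('test".jpg'))  # <img src="test&quot;.jpg">
```

The browser decodes &quot; back to a double quote when it reads the attribute, so the src value it sees is still test".jpg.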
Maciej Ziarko
13

If the value being escaped might contain quotes, the best option is the quoteattr function from xml.sax.saxutils: http://docs.python.org/library/xml.sax.utils.html#module-xml.sax.saxutils

This is referenced right beneath the docs on the cgi.escape() method.
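As a rough sketch of how that could be used: quoteattr returns the value already wrapped in quotes, picking the quote character based on the content, so the template must not add its own. The helper name img_tag_quoteattr is hypothetical.

```python
from xml.sax.saxutils import quoteattr

def img_tag_quoteattr(src):
    # img_tag_quoteattr is a hypothetical helper for illustration.
    # quoteattr supplies the surrounding quotes itself, so none are
    # written in the format string.
    return '<img src={}>'.format(quoteattr(src))

print(img_tag_quoteattr('plain.jpg'))   # <img src="plain.jpg">
print(img_tag_quoteattr('test".jpg'))   # <img src='test".jpg'>
print(img_tag_quoteattr('a"b\'c.jpg'))  # <img src="a&quot;b'c.jpg">
```

When the value contains only double quotes, quoteattr switches to single quotes around the attribute; when it contains both kinds, it keeps double quotes and replaces the inner ones with &quot;.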

gomad
  • 2
    +1, quoteattr is **exactly** the right function to use for this (and the online Python docs are pretty clear about this, too!). – Alex Martelli Jun 23 '10 at 00:36
  • That's cool. But worth noting if your string contains both single and double quotes, you'll get a URL with `&quot;` in it, which is not likely to resolve to the resource you are targeting. – tcarobruce Jun 23 '10 at 01:00
  • 2
    This function is insufficient. I was able to inject HTML this way. `django.utils.html.escape` worked, though. – 2rs2ts Nov 15 '13 at 21:24
4
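import cgi
# Python 2 era API: cgi.escape was deprecated in Python 3.2 and removed in 3.8 (use html.escape there).
s = cgi.escape('test".jpg', True)  # s == 'test&quot;.jpg'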

http://docs.python.org/library/cgi.html#cgi.escape

Note that the True flag tells it to escape double quotes. If you need to escape single quotes as well (if you're one of those rare individuals who use single quotes to surround html attributes) read the note in that documentation link about xml.sax.saxutils.quoteattr(). The latter does both kinds of quotes, though it is about three times as slow:

>>> import timeit
>>> timeit.Timer( "escape('asdf\"asef', True)", "from cgi import escape").timeit()
1.2772219181060791
>>> timeit.Timer( "quoteattr('asdf\"asef')", "from xml.sax.saxutils import quoteattr").timeit()
3.9785079956054688
ʇsәɹoɈ
  • 3
    cgi.escape does not escape single quotes. For this reason it is dangerous to use it for HTML escaping, because the attribute the variable is being put into may be single quoted. If the attribute is single quoted, a cross-site scripting vulnerability could easily be found. – Craig Younkins Jun 24 '10 at 02:41
  • 1
    I explicitly mentioned the single quote issue in my answer. – ʇsәɹoɈ Jun 24 '10 at 03:41
2

If the URL you're using (as an img src here) might contain quotes, you should use URL quoting.

For Python, use the urllib.quote function before passing the URL string to your template:

import urllib  # Python 2; in Python 3 the function is urllib.parse.quote
img_url = 'test".jpg'
__string__ = urllib.quote(img_url)  # 'test%22.jpg'
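On Python 3 the same idea could be sketched as below, using urllib.parse.quote. Running the result through html.escape as well is an addition here, not part of the answer above; it is an assumption that the value ends up in a double-quoted attribute. The variable names quoted and tag are illustrative.

```python
import html
from urllib.parse import quote

img_url = 'test".jpg'

# Percent-encode characters that are not valid in a URL path.
# By default only "/" is kept as a safe reserved character.
quoted = quote(img_url)

# HTML-escape on top, in case the URL contains & or other markup characters.
tag = '<img src="{}">'.format(html.escape(quoted))

print(quoted)  # test%22.jpg
print(tag)     # <img src="test%22.jpg">
```

Note that quote is meant for a path or path component: passed a full URL, it would also encode the ":" after the scheme. As the comments below point out, this approach only fits attributes that hold URLs, not free text such as title.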
tcarobruce
  • thanks, but if it's not a url or unicode, it fails for the title attribute – Timmy Jun 22 '10 at 21:05
  • @Timmy, what do you mean by "it fails for title attribute"? The call to urllib.quote returns "test%22.jpg", which I believe is what you want. – Nikhil Jun 22 '10 at 22:05
-3

The best way to escape XML or HTML in Python is probably with triple quotes. Note that you can also escape carriage returns.

"""<foo bar="1" baz="2" bat="3">
<ack/>
</foo>
"""
eeeeaaii