
In one of my projects I'm using cgi.escape() to escape a set of titles that I get from a resource. These titles could be from YouTube or anywhere else, and may need to be escaped.

The issue I'm having is that if a title from YouTube is already escaped and I pass it into cgi.escape(), I end up with double-escaped titles, which breaks later parts of my project.

Is there a library that will escape strings but check if a piece is already escaped, and ignore it?

Zach
  • https://wiki.python.org/moin/EscapingHtml – Padraic Cunningham Sep 21 '15 at 23:50
  • According to http://webhelpers2.readthedocs.org/en/latest/modules/html/builder.html, webhelpers2 has a literal class. literal.escape() returns literal instances and, if given a literal to escape, returns it unchanged. –  Sep 21 '15 at 23:56
  • `import html; s = """>>"""; s = html.escape(html.unescape(s))` – Max Oct 16 '18 at 20:59

3 Answers


webhelpers2.html.builder.literal represents an "HTML literal string, which will not be further escaped". Its escape method escapes HTML and returns a literal; if it is given a literal, it returns it unchanged. A literal instance can be converted back to a plain string with ''.join(literal_instance).

For example using Python 2.7.10:

from webhelpers2.html.builder import literal

e1 = literal.escape('& < >')
e1
Out[3]: literal(u'&amp; &lt; &gt;')

e2 = literal.escape(e1)
e2
Out[5]: literal(u'&amp; &lt; &gt;')

s = ''.join(e1)
s
Out[7]: u'&amp; &lt; &gt;'

With Python 3.4.3:

from webhelpers2.html.builder import literal

e1 = literal.escape('& < >')
e1
Out[3]: literal('&amp; &lt; &gt;')

e2 = literal.escape(e1)
e2
Out[5]: literal('&amp; &lt; &gt;')

s = ''.join(e1)
s
Out[7]: '&amp; &lt; &gt;'
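
The same "mark it as already escaped" idea can be sketched with only the standard library, for cases where adding webhelpers2 is not an option. This is a minimal illustration, not how webhelpers2 is implemented: the Escaped class and the escape_once function are hypothetical names invented for this example, and it assumes Python 3 with html.escape().

```python
import html


class Escaped(str):
    """Hypothetical marker type: a str whose content is already HTML-escaped."""


def escape_once(value):
    """Escape a plain string; return already-marked values unchanged."""
    # Values tagged as Escaped have been through html.escape() before,
    # so they are passed through untouched.
    if isinstance(value, Escaped):
        return value
    # Plain strings are escaped exactly once and tagged with the marker type.
    return Escaped(html.escape(value))


first = escape_once('& < >')
second = escape_once(first)  # no double escaping happens here

print(first)
print(second is first)
```

This only helps for strings that were escaped by your own code. A title that arrives from an external source already escaped is still a plain str, so the marker cannot detect it.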

If you know your input is already escaped, unescape it first. Then escape it once, just before the point where the escaped form is needed.
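
A minimal sketch of that approach, assuming Python 3.4 or later, where html.unescape() is available. The function name normalize_escape and the sample titles are made up for illustration. Note that cgi.escape() was deprecated in Python 3.2 and removed in 3.8; html.escape() is its replacement and, unlike cgi.escape(), also escapes quote characters by default.

```python
import html


def normalize_escape(title):
    """Return title escaped exactly once, whether or not it arrived escaped."""
    # Unescaping first turns '&amp;' back into '&'; on a raw title it
    # leaves bare '&', '<' and '>' characters as they are.
    raw = html.unescape(title)
    # A single escape pass then gives the same result for both inputs.
    return html.escape(raw)


already_escaped = 'Tom &amp; Jerry <3'
raw_title = 'Tom & Jerry <3'

print(normalize_escape(already_escaped))
print(normalize_escape(raw_title))
```

One caveat: a raw title whose visible text really is the five characters "&amp;" will be treated as if it were already escaped, so this trades that rare case for protection against double escaping.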

AcidReign

You can unescape the possibly-escaped strings first, so that every title is in its raw form, then pass them to whatever escaping you're doing yourself.

Community
alksdjg