Python 2.7 treats an unprefixed string literal as a byte string (str), not unicode, so even though you've tried to include a unicode character in your argument to foo.replace, replace just sees the six ASCII characters '\', 'u', '2', '0', '1', '9'. This is because Python 2 doesn't assign a special meaning to "\u" unless it is parsing a unicode literal.
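As a rough illustration of the difference, the sketch below compares the two kinds of literal. The variable names are made up for the example, and the b prefix is an addition that makes the byte-string type explicit so the snippet behaves the same on Python 2.7 and Python 3 (in Python 2.7, b'...' is the same type as a plain '...' literal). Newer Python 3 releases may warn about the unrecognized \u escape in the byte literal.

```python
# Hypothetical names, for illustration only.
byte_literal = b'\u2019'     # byte string: \u is not an escape here
unicode_literal = u'\u2019'  # unicode string: \u2019 is parsed as one code point

print(len(byte_literal))     # 6
print(len(unicode_literal))  # 1
```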
To tell Python 2.7 that this is a unicode string, you have to prefix the string with a u, as in foo.replace(u'\u2019', "'").
Additionally, to start a unicode escape you need \u, not \\u. The latter means you want an actual '\' in the string followed by a 'u'.
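A small sketch of why the doubled backslash never matches, assuming a made-up sample string (foo here is hypothetical, not the asker's data):

```python
# Hypothetical sample text containing U+2019 (right single quotation mark).
foo = u'it\u2019s'

single = u'\u2019'   # one character: the code point U+2019
double = u'\\u2019'  # six characters: a backslash, then 'u2019'

print(len(single))  # 1
print(len(double))  # 6

# The six-character pattern does not occur in foo, so nothing is replaced.
print(foo.replace(double, u"'") == foo)      # True
print(foo.replace(single, u"'") == u"it's")  # True
```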
Finally, note that foo will not change as a result of calling replace. Strings are immutable, so replace returns a new string, which you must assign to a variable, like this:
bar = foo.replace(u'\u2019', "'")
print bar
(see stackoverflow.com/q/26943256/4909087)
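To check that the original string really is left untouched, something like the following should work on both Python 2.7 and Python 3. The sample value of foo is an assumption for the example, and the comparisons are printed instead of the strings themselves to sidestep console encoding issues.

```python
# Hypothetical input; any unicode string containing U+2019 would do.
foo = u'don\u2019t'

# replace returns a new string; foo itself is not modified.
bar = foo.replace(u'\u2019', u"'")

print(foo == u'don\u2019t')  # True: foo is unchanged
print(bar == u"don't")       # True: bar holds the replaced text
```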