How to fix a mixed encoding unicode object?

Asked Jan 03 '15 at 12:58

Active Jan 03 '15 at 13:13

I have this string in Python and it is driving me nuts. How can I fix (normalize) the encoding in this string:

u"wie wir sie lieben **\u2013** unkompliziert und am Puls der Zeit**\u201c**\nElegant, casual oder maritim, unsere Alltags-Looks sind so **viel\xe4fltig** wie unser Leben"

Martijn marked it as a duplicate question, so I need to ask once more. The above string contains both `\u2013` and `\xe4f`. Doesn't this indicate a mixed encoding? If so, how can it be fixed? Or am I misunderstanding something?

edited Jan 03 '15 at 13:13

asked Jan 03 '15 at 12:58

Jabb


If I do `print()` on that string, Python can print it. So what exactly do you wish to change? – Simeon Visser Jan 03 '15 at 13:00

Don't confuse the `repr()` output with the actual contents. You already have the right value, there is nothing to fix here. – Martijn Pieters Jan 03 '15 at 13:01
Doesn't `\u2013` and `\xe4f` in one string indicate a mixed encoding? – Jabb Jan 03 '15 at 13:02
That is just a Unicode string. You have code points U+2013 and U+00E4 in it. Since U+00E4 is less than 256, Python shows it as `\xe4` (the f is the next character after the `\xe4`). – Ned Batchelder Jan 03 '15 at 13:16
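
The point made in the comments above can be checked directly. The following is only a minimal sketch, assuming Python 3 (the `u""` prefix is still accepted there) and a shortened version of the string from the question; the variable names are made up for illustration. It shows that `\xe4` and `\u00e4` are two spellings of the same code point, that the `f` is a separate character, and that bytes only come into play when the string is explicitly encoded.

```python
# Shortened version of the string from the question (an assumption for brevity).
s = u"sie lieben \u2013 so viel\xe4fltig"

# \xe4 and \u00e4 are two escape spellings of the same code point, U+00E4.
same_escape = u"\xe4" == u"\u00e4"

# The 'f' following \xe4 is its own character, not part of the escape.
idx = s.index(u"\xe4")
pair = [hex(ord(ch)) for ch in s[idx:idx + 2]]

# A Unicode string holds code points; an encoding is only chosen
# when the string is explicitly turned into bytes.
utf8_bytes = s.encode("utf-8")

print(same_escape)
print(pair)
print(repr(utf8_bytes))
```

If the round trip `utf8_bytes.decode("utf-8") == s` holds, the string is consistent and there is nothing to normalize; the differing escapes are just how `repr()` chooses to display code points below and above 256.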
