I have a file that contains a unicode string: u"L'\xe9quipe le quotidien"
I have another file, exported from Windows and encoded as ISO-8859-1, with the same string: "L'<E9>quipe le quotidien" (this is a copy/paste from less in my shell).
Converting the content of the Windows file with decode('iso-8859-1').encode('utf8') results in a string that is different from the one in the Windows file: L'équipe le quotidien.
What is the best way to do this comparison? I seem to be unable to convert the latin1 string into utf-8.
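
To make the situation concrete, here is a minimal sketch. It assumes the contents of the two files have already been read, so the literals and variable names below are hypothetical stand-ins, not my actual code. It reproduces the conversion I tried and also shows the comparison done on decoded text, which may be the more sensible place to compare:

```python
# Hypothetical stand-ins for what is read from the two files.
unicode_text = u"L'\xe9quipe le quotidien"   # the unicode string from the first file
latin1_bytes = b"L'\xe9quipe le quotidien"   # raw ISO-8859-1 bytes from the Windows file

# The conversion I tried: decode from Latin-1, then re-encode as UTF-8.
# The result is a byte string again, with the e-acute stored as two bytes.
utf8_bytes = latin1_bytes.decode('iso-8859-1').encode('utf8')
print(repr(utf8_bytes))

# The re-encoded bytes differ from the original Latin-1 bytes.
print(utf8_bytes != latin1_bytes)

# Stopping after the decode step gives text that can be compared
# directly with the unicode string.
decoded_text = latin1_bytes.decode('iso-8859-1')
print(decoded_text == unicode_text)
```

In this sketch the decoded text compares equal to the unicode string, while the UTF-8 bytes are simply a different byte sequence from the Latin-1 bytes, which would explain the mismatch I am seeing.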