I'm loading a file that contains Unicode characters written as escaped UTF-8 bytes (e.g. \xe9\x87\x8b). I want to convert these to their escaped-Unicode form (\u91cb) in Python. I've found a couple of similar questions here on Stack Overflow, including "Evaluate UTF-8 literal escape sequences in a string in Python3", which does almost exactly what I want, but I can't work out how to save the data.
For example:

Input file:
\xe9\x87\x8b
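Python script:
# Read the raw text, which contains literal \xNN escape sequences
with open("input.txt", "r") as infile:
    text = infile.read()
# Interpret the escapes as bytes, then decode those bytes as UTF-8
encoded = text.encode().decode('unicode-escape').encode('latin1').decode('utf-8')
with open("output.txt", "w") as outfile:
    outfile.write(encoded)  # fails with a unicode exception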
Output file (that I would like):
\u91cb
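
For reference, here is a minimal sketch of one way the desired output might be produced. It assumes the input file is plain ASCII containing literal \xNN sequences, and the function names are hypothetical, not taken from the linked question. The decoding chain is the same one used in the script above; the only addition is re-encoding the result to ASCII with the "backslashreplace" error handler, which turns each non-ASCII character into a backslash escape while leaving newlines untouched.

```python
def utf8_escapes_to_unicode_escapes(text: str) -> str:
    """Turn literal \\xNN UTF-8 byte escapes into \\uNNNN escapes (hypothetical helper)."""
    # Interpret the literal \xNN sequences, giving one character per byte.
    byte_chars = text.encode("ascii").decode("unicode-escape")
    # Map those characters back to raw bytes and decode the bytes as UTF-8.
    decoded = byte_chars.encode("latin1").decode("utf-8")
    # Replace every non-ASCII character with a backslash escape such as \u91cb.
    return decoded.encode("ascii", "backslashreplace").decode("ascii")


def convert_file(src: str, dst: str) -> None:
    """Read src, convert its escapes, and write the result to dst (hypothetical helper)."""
    with open(src, "r", encoding="ascii") as infile:
        text = infile.read()
    # The converted text is pure ASCII, so writing it cannot hit an encoding error.
    with open(dst, "w", encoding="ascii") as outfile:
        outfile.write(utf8_escapes_to_unicode_escapes(text))
```

Calling convert_file("input.txt", "output.txt") on the example input would be expected to write \u91cb. Note that "backslashreplace" writes characters in the range U+0080 to U+00FF as \xNN rather than \u00NN, which may or may not matter for the intended consumer of the file.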