Python search through file and un-escape unicode characters

Asked Nov 16 '20 at 19:38

I have a file with escaped Unicode characters scattered through it, and I would like to un-escape them. For example, \uE0001 would become uE0001 and \uFEFF would become uFEFF. So far, I have:

with open(path, encoding="utf-8") as f:
    s = f.read()
    s = s.replace("\\u", "u")  # drop the backslash from each \u escape
with open(fpath, "w") as f:
    f.write(s)

But that gives the error:

UnicodeEncodeError: 'charmap' codec can't encode character '\ufeff' in position 0: character maps to <undefined>

Either I did something wrong in the replacement and there are still Unicode characters left, or Python is still trying to encode one. What did I do wrong here, and how can I get a working program?
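For what it's worth, the traceback points at a real U+FEFF character at position 0, not at a six-character `\uFEFF` escape, so `replace` never sees it. A minimal sketch of how that can happen, assuming the file starts with a UTF-8 byte order mark and that the platform default encoding is cp1252 (typical on Windows, but an assumption here):

```python
bom = "\ufeff"  # the character named in the traceback

# Decoding with plain "utf-8" keeps a leading BOM as an ordinary character.
text = b"\xef\xbb\xbfhello".decode("utf-8")
starts_with_bom = text.startswith(bom)

# cp1252 is only assumed to be the default; it has no mapping for U+FEFF,
# so encoding fails the same way an encoding-less open(..., "w") would.
try:
    bom.encode("cp1252")
    encodable = True
    message = ""
except UnicodeEncodeError as exc:
    encodable = False
    message = str(exc)

print(starts_with_bom, encodable)
print(message)
```

If that is what is going on, the failure comes from the write step, because `open(fpath, "w")` without an `encoding` argument falls back to the locale's encoding.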

Beckett O'Brien

Show the file. Are you sure it's not JSON serialized? – wim Nov 16 '20 at 19:42
Did you try `with open(path, encoding="utf-8-sig") as f:`? – JosefZ Nov 16 '20 at 19:45
@JosefZ `utf-8-sig` seemed to work! Thank you! – Beckett O'Brien Nov 16 '20 at 20:06
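Putting the `utf-8-sig` suggestion together with the original replacement, a working version might look like the sketch below. The function name `strip_bom_and_unescape` and the temporary file names are hypothetical, made up for this example; the explicit `encoding="utf-8"` on the output file is an addition that was not part of the comment.

```python
import os
import tempfile


def strip_bom_and_unescape(src_path, dst_path):
    """Read src_path, drop a leading BOM, turn "\\u" into "u", write dst_path."""
    # "utf-8-sig" decodes UTF-8 and silently skips a BOM at the start, if any.
    with open(src_path, encoding="utf-8-sig") as src:
        text = src.read()
    text = text.replace("\\u", "u")
    # An explicit encoding avoids depending on the platform default.
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(text)
    return text


# Small self-contained demo with a throwaway input file.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "in.txt")
    dst_path = os.path.join(tmp, "out.txt")
    with open(src_path, "wb") as f:
        f.write(b"\xef\xbb\xbf" + b"a \\uFEFF b")  # BOM, then an escaped \uFEFF
    result = strip_bom_and_unescape(src_path, dst_path)
    with open(dst_path, encoding="utf-8") as f:
        written = f.read()

print(result)
```

Note that `utf-8-sig` only removes a BOM at the very start of the file. If other characters outside the default code page appear later in the text, writing with an explicit `encoding="utf-8"` is what keeps the write step from failing.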
