I extract Unicode text from a website and print it, and the output can contain non-ASCII characters. Printing those raises a UnicodeEncodeError. How should I handle this case? I need a solution that requires minimal changes to the existing code.
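For illustration, the error can be reproduced without a real console. This is only a sketch: the names try_print and sample are made up for the example, and an in-memory stream with a restrictive encoding stands in for a console whose encoding cannot represent the text.

```python
import io


def try_print(text, encoding):
    """Print text to an in-memory text stream that uses the given encoding."""
    # The TextIOWrapper stands in for a console with a limited encoding.
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        print(text, file=stream)
        stream.flush()
        return "ok"
    except UnicodeEncodeError as exc:
        return "UnicodeEncodeError: " + exc.reason


# Hypothetical scraped text containing non-ASCII characters.
sample = "caf\u00e9 \u2603"

print(try_print(sample, "ascii"))   # the stream cannot encode the text
print(try_print(sample, "utf-8"))   # UTF-8 can encode every character
```

The same text succeeds or fails depending only on the encoding of the stream it is written to, which is why the problem shows up on some consoles and not on others.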
I tried encoding the strings as UTF-8 (by wrapping sys.stdout), but then I get a TypeError, since the original sys.stdout.write() method expects str, not bytes.
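Roughly, the attempt looked like the sketch below. EncodingWrapper and attempt are illustrative names, not the real code, and an io.StringIO stands in for sys.stdout here because both are text streams that reject bytes.

```python
import io


class EncodingWrapper:
    """Hypothetical wrapper that encodes text before forwarding it."""

    def __init__(self, stream, encoding="utf-8"):
        self._stream = stream
        self._encoding = encoding

    def write(self, text):
        # Forwards bytes to a text stream, which is what triggers the TypeError.
        return self._stream.write(text.encode(self._encoding))

    def flush(self):
        self._stream.flush()


def attempt(text):
    """Print through the wrapper and report what happens."""
    target = io.StringIO()  # stand-in for sys.stdout (a text stream)
    try:
        print(text, file=EncodingWrapper(target))
        return "ok"
    except TypeError as exc:
        return "TypeError: " + str(exc)


print(attempt("caf\u00e9"))
```

The wrapper fails even for pure ASCII input, because the underlying text stream refuses bytes regardless of their content.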
I don't want any characters to be lost. The user may want to pipe the output into a file, which in that case should be UTF-8 encoded.
Edit: Setting PYTHONIOENCODING=utf8 can fix the problem, but isn't there a way to imitate this behaviour from within Python? Since UTF-8 doesn't match the Windows console encoding, some characters look strange, but that is far better than the program crashing.
Is there any way to set the output encoding from within Python, so that I don't need to modify environment variables?