
I extract Unicode text from a website and print it; the text can contain non-ASCII characters. Printing those characters raises a UnicodeEncodeError. How should I handle this case? I need a solution that requires minimal changes to the existing code.

I tried encoding the strings as UTF-8 (by wrapping sys.stdout), but then I get a TypeError, since the original sys.stdout.write() method expects str, not bytes.
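
To illustrate the kind of failure described here, the following is a minimal sketch, not the actual code from the program. It assumes the wrapper was built with something like codecs.getwriter, and it uses io.StringIO as a stand-in for sys.stdout (both are text streams whose write() accepts only str). The function name demo_failed_wrap is made up for this example.

```python
import codecs
import io


def demo_failed_wrap():
    """Show why an encoding writer on top of a text stream fails."""
    # Stand-in for sys.stdout: a text stream whose write() accepts only str.
    text_stream = io.StringIO()
    # codecs.getwriter returns a StreamWriter class; its write() encodes
    # the str to bytes and then passes those bytes to the wrapped stream.
    writer = codecs.getwriter("utf-8")(text_stream)
    try:
        writer.write("caf\u00e9")
    except TypeError as exc:
        # The text stream rejects the bytes produced by the encoder.
        return type(exc).__name__
    return "no error"


if __name__ == "__main__":
    print(demo_failed_wrap())
```

The encoder itself works; the problem is that the encoded bytes are handed to a stream that only takes str.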

I don't want the characters to be lost. The user may want to pipe the output into a file, which in this case would be UTF-8 encoded.


Edit: Setting PYTHONIOENCODING=utf8 fixes the problem, but isn't there a way to "imitate" this behaviour from within Python? Since UTF-8 doesn't match the Windows console encoding, some characters look strange, but that is far better than the program crashing.

Is there any way to set the output encoding from within Python, so that I don't need to modify environment variables?
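
For reference, one possible shape of such an in-program approach is sketched below. It assumes Python 3, where sys.stdout exposes the underlying binary stream as sys.stdout.buffer, and it wraps that stream in a new io.TextIOWrapper. The helper name utf8_text_stream is hypothetical, and this is only a sketch of the idea, not a confirmed equivalent of PYTHONIOENCODING in every respect.

```python
import io
import sys


def utf8_text_stream(binary_stream):
    """Wrap a binary stream so that str written to it is encoded as UTF-8."""
    return io.TextIOWrapper(
        binary_stream,
        encoding="utf-8",
        errors="strict",
        line_buffering=True,  # flush whenever a newline is written
    )


if __name__ == "__main__":
    # Assumption: sys.stdout is a regular text stream with a .buffer attribute.
    sys.stdout = utf8_text_stream(sys.stdout.buffer)
    print("caf\u00e9 \u2713")
```

On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") changes the encoding of the existing stream in place and may be the smaller change. Either way, existing print() calls stay untouched, and output piped into a file ends up UTF-8 encoded.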

Niklas R
