Handling print() UnicodeEncodeError

Asked Jun 16 '14 at 19:07

Active Jun 18 '14 at 13:57

Viewed 139 times

I extract Unicode text from a website and print it, and the text can contain non-ASCII characters. Printing those raises a UnicodeEncodeError. How should I handle this case? I need a solution that requires minimal changes to the existing code.

I tried encoding the strings as UTF-8 (by wrapping sys.stdout), but then I get a TypeError, since the original sys.stdout.write() method expects str, not bytes.
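For illustration, here is a minimal sketch of one way to avoid that TypeError, assuming Python 3: instead of passing encoded bytes to the text-level `sys.stdout.write()`, wrap the underlying binary stream in a new text layer that encodes to UTF-8. The names `make_utf8_writer` and `demo` are hypothetical, and an in-memory `io.BytesIO` stands in for the real stream so the sketch has no side effects.

```python
import io


def make_utf8_writer(binary_stream):
    """Wrap a binary stream in a text layer that encodes to UTF-8.

    Hypothetical helper: callers keep writing str, and the wrapper
    performs the encoding, so no bytes ever reach a text-mode write().
    """
    return io.TextIOWrapper(
        binary_stream,
        encoding="utf-8",
        errors="strict",       # fail loudly rather than drop characters
        line_buffering=True,   # flush on newlines, like a console stream
    )


def demo():
    """Print non-ASCII text through the wrapper and return the raw bytes."""
    raw = io.BytesIO()  # stand-in for sys.stdout.buffer
    writer = make_utf8_writer(raw)
    print("caf\u00e9 \u2713", file=writer)
    writer.flush()
    return raw.getvalue()
```

In a real program the same wrapper would presumably be installed once at startup with `sys.stdout = make_utf8_writer(sys.stdout.buffer)`, leaving every existing `print()` call unchanged.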

I don't want the characters to be lost. The user may want to pipe the output into a file, which in this case would be UTF-8 encoded.

Edit: Setting PYTHONIOENCODING=utf8 fixes the problem, but isn't there a way to "imitate" this behaviour from within Python? Since UTF-8 doesn't match the Windows console encoding, some characters look strange, but that is far better than the program crashing.

Is there any way to format the output from within Python so that I don't need to modify environment variables?
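One possible approach, sketched here under assumptions rather than offered as a definitive answer: choose the encoding inside the program depending on whether the output goes to a console or is piped. The helper names `choose_output_settings` and `rewrap_stdout` are hypothetical. Piped output is written as lossless UTF-8; on an interactive console the console's own encoding is kept and unencodable characters are escaped with the standard `backslashreplace` error handler, so nothing raises, though those characters appear as escapes rather than as themselves.

```python
import io


def choose_output_settings(is_tty, console_encoding):
    """Return (encoding, errors) for the output stream (hypothetical helper)."""
    if is_tty and console_encoding:
        # Interactive console: keep its encoding so supported characters
        # render correctly, and escape the rest instead of raising.
        return console_encoding, "backslashreplace"
    # Piped or redirected output: write lossless UTF-8.
    return "utf-8", "strict"


def rewrap_stdout(stream):
    """Build a new text wrapper over stream.buffer with the chosen settings.

    Assumes `stream` is a text stream exposing .buffer, .isatty() and
    .encoding, as sys.stdout normally does in Python 3.
    """
    encoding, errors = choose_output_settings(stream.isatty(), stream.encoding)
    return io.TextIOWrapper(
        stream.buffer,
        encoding=encoding,
        errors=errors,
        line_buffering=True,
    )
```

Calling `sys.stdout = rewrap_stdout(sys.stdout)` once at startup would apply this without touching the existing `print()` calls or any environment variable.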

edited Jun 18 '14 at 13:57

Niklas R


See http://stackoverflow.com/questions/2276200/changing-default-encoding-of-python – Mark Ransom Jun 16 '14 at 19:12
How can I make `reload(sys).setdefaultencoding` work in Python 3? – Niklas R Jun 16 '14 at 19:47
OK, wait, using `export PYTHONIOENCODING="utf8"` works. It still feels clunky, though: an environment variable "controls" whether the program raises an exception. – Niklas R Jun 16 '14 at 19:49


0 Answers