What's the best way to output non-ASCII characters correctly in a way that's compatible with both Python 2 and 3? Is it this?

# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from __future__ import print_function
print("ȧƈƈḗƞŧḗḓ uʍop-ǝpısdn ŧḗẋŧ ƒǿř ŧḗşŧīƞɠ")

One problem with that approach is that it doesn't degrade gracefully when output is limited to, for example, ASCII or Latin-1. The default behavior is to raise an exception like:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)

...I'd much prefer escaping, such as with the 'replace' or 'backslashreplace' error handlers. Is there a way to configure sys.stdout to use one of these handlers? Is that a reasonable thing to do?
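To make the idea concrete, here is a minimal sketch of the kind of wrapper I have in mind. It is not a recommended or confirmed solution. The helper name wrap_with_error_handler is hypothetical, and the sketch assumes the stream either is a byte stream already (as sys.stdout is in Python 2) or exposes its byte stream as .buffer (as text streams do in Python 3). It falls back to ASCII when the stream reports no encoding.

```python
import codecs


def wrap_with_error_handler(stream, errors="backslashreplace"):
    """Hypothetical helper: return a writer that escapes unencodable text.

    `errors` may be any registered codec error handler name, for example
    'replace' or 'backslashreplace'.
    """
    # Assumption: use the encoding the stream reports, or ASCII if none.
    encoding = getattr(stream, "encoding", None) or "ascii"
    # Flush pending text so output written earlier keeps its order.
    stream.flush()
    # Python 3 text streams expose the underlying byte stream as .buffer;
    # a Python 2 sys.stdout is itself a byte stream and has no .buffer.
    raw = getattr(stream, "buffer", stream)
    # codecs.getwriter returns a StreamWriter class for the encoding.
    return codecs.getwriter(encoding)(raw, errors=errors)


# Possible usage (not executed here):
#   import sys
#   sys.stdout = wrap_with_error_handler(sys.stdout)
#   print(u"\u0227\u0188\u0188")  # escaped rather than raising on ASCII
```

Two alternatives that avoid code changes may also be relevant: the PYTHONIOENCODING environment variable accepts an encoding:errorhandler value (for example utf-8:backslashreplace), and Python 3.7 and later offer sys.stdout.reconfigure(errors='backslashreplace'), though the latter is not available in Python 2.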

Unicode encoding in Python has been discussed a lot on Stack Overflow, for example: "How to write utf8 to standard output in a way that works with python2 and python3", snapshoe's answer, and a helpful comment by Martijn Pieters, as well as "Setting the correct encoding when piping stdout in Python".

But I still don't see a clear "best" way, especially with regard to handling errors gracefully.
