What's the best way to print non-ASCII characters so that the same code works on both Python 2 and Python 3? Is it this?
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from __future__ import print_function
print("ȧƈƈḗƞŧḗḓ uʍop-ǝpısdn ŧḗẋŧ ƒǿř ŧḗşŧīƞɠ")
One problem with that approach is that it doesn't degrade gracefully when output is limited to, for example, ascii or latin1. The default behavior is to raise an exception like:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)
...I'd much prefer escaping, such as the 'replace' or 'backslashreplace' error handlers. Is there a way to configure sys.stdout to use one of these handlers? Is that a reasonable thing to do?
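To make the question concrete, here is roughly the kind of thing I have in mind. It is a minimal sketch, not a recommendation: the helper names make_lenient_writer and install_lenient_stdout are my own (hypothetical, not stdlib), and I'm assuming that wrapping the underlying byte stream with codecs.getwriter is acceptable on both Python 2 and 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import codecs
import io
import sys


def make_lenient_writer(stream, encoding="ascii", errors="backslashreplace"):
    """Wrap a byte stream so unencodable characters are escaped, not fatal.

    Hypothetical helper: codecs.getwriter returns a StreamWriter class,
    which accepts the target stream and an error handler name.
    """
    return codecs.getwriter(encoding)(stream, errors=errors)


def install_lenient_stdout(errors="backslashreplace"):
    """Replace sys.stdout with a lenient writer (hypothetical helper)."""
    # Fall back to ascii if the stream does not report an encoding.
    encoding = getattr(sys.stdout, "encoding", None) or "ascii"
    # Python 3 exposes the byte stream as .buffer; Python 2 writes bytes directly.
    raw = getattr(sys.stdout, "buffer", sys.stdout)
    sys.stdout = make_lenient_writer(raw, encoding, errors)


# Demonstration against an in-memory byte stream instead of the real stdout.
# The text is written with escapes so the example does not depend on the
# source file encoding.
buf = io.BytesIO()
out = make_lenient_writer(buf, "ascii", "backslashreplace")
print("na\u00efve caf\u00e9", file=out)
escaped = buf.getvalue()

buf2 = io.BytesIO()
make_lenient_writer(buf2, "ascii", "replace").write("caf\u00e9")
replaced = buf2.getvalue()
```

With 'backslashreplace' the unencodable characters come out as escape sequences, and with 'replace' they become question marks, so nothing raises. What I can't tell is whether swapping out sys.stdout like this is considered good practice.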
Unicode encoding in Python has been discussed a lot on Stack Overflow, for example in "How to write utf8 to standard output in a way that works with python2 and python3" (see snapshoe's answer and a helpful comment by Martijn Pieters) and in "Setting the correct encoding when piping stdout in Python".
But I still don't see a clear "best" way, especially with regard to handling errors gracefully.