sys.setdefaultencoding is removed by site for a reason, and you shouldn't use reload(sys) to restore it. Instead, my solution would be to do nothing: Python automatically detects the encoding based on the LANG environment variable or, on Windows, the chcp code page.
$ python
Python 2.7.3 (default, Sep 26 2012, 21:51:14)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> import os
>>> sys.stdout.encoding
'UTF-8'
>>> os.environ["LANG"]
'pl_PL.UTF-8'
>>> print u"\xabtest\xbb"
«test»
>>>
But this can cause issues when the terminal's encoding doesn't contain the characters you want. In that case you should try to degrade gracefully: the chance of displaying those characters is close to zero, so fall back to a pure-ASCII version, use Unidecode to produce usable output, or simply fail. One option is to catch the exception and print a basic version of the string instead.
$ LANG=C python
Python 2.7.3 (default, Sep 26 2012, 21:51:14)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> import os
>>> sys.stdout.encoding
'ANSI_X3.4-1968'
>>> os.environ["LANG"]
'C'
>>> print u"\xabtest\xbb"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xab' in position 0: ordinal not in range(128)
>>>
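A minimal sketch of the "degrade gracefully" idea, assuming you can supply a pure-ASCII fallback for each string. The helper name displayable and its parameters are hypothetical, not part of any library. It only checks whether the target encoding can represent the text, and returns the fallback when it can't:

```python
from __future__ import print_function

import sys


def displayable(text, fallback, encoding=None):
    """Return `text` if `encoding` can represent it, otherwise `fallback`.

    `displayable` is a hypothetical helper for illustration. When no
    encoding is given, it uses sys.stdout.encoding, and assumes ASCII
    if that is missing (for example when output is piped in Python 2).
    """
    if encoding is None:
        encoding = getattr(sys.stdout, "encoding", None) or "ascii"
    try:
        # Probe only: the encoded bytes are discarded.
        text.encode(encoding)
    except UnicodeEncodeError:
        return fallback
    return text


# Prints the guillemets where the terminal supports them,
# and plain ASCII quotes where it does not (e.g. under LANG=C).
print(displayable(u"\xabtest\xbb", u'"test"'))
```

Instead of a hand-written fallback you could pass the output of Unidecode, or re-raise the exception if failing is the better choice for your program.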
But there is a problem called Windows, which has trouble with Unicode support. While chcp 65001 should technically work, it doesn't actually work unless you're using Python 3.3. Python uses the portable stdio.h, but cmd.exe expects Windows-specific calls such as WriteConsoleW(). Really, only 8-bit encodings (such as CP437) work reliably.
The workaround is to use another terminal that supports Unicode properly, such as Cygwin's console or IDLE, which is included with Python.