Possible Duplicate:
How to display utf-8 in windows console

Python "print" statement doesn't work.

To avoid this error,

'ascii' codec can't decode byte 0xec in position 0: ordinal not in range(128)

I added the following statements to my code:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

Before these statements, "print" works fine. After them, however, "print" produces no output.

print "A"
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
print "B"

Here, only "A" is printed. I am running Python 2.7.3 (64-bit) for Windows, in the Python 2.7 IDLE.

I need help

Park
  • `sys.setdefaultencoding` is removed from `sys` once the `site` module is imported. This is documented and is done for a reason. You can't use `sys.setdefaultencoding` this way. See the linked question. – BrenBarn Nov 18 '12 at 07:37
  • Actually, I solved this problem by using Aptana Studio 3 instead of IDLE. It might be because their default encoding settings differ. – Park Nov 22 '12 at 17:49

1 Answer

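sys.setdefaultencoding is removed by site for a reason, and you shouldn't use reload(sys) to restore it. In IDLE, reload(sys) also resets sys.stdout to the original stream instead of IDLE's own replacement, which is why the output of "print" disappears afterwards. Instead, my solution would be to do nothing: Python automatically detects the encoding based on the LANG environment variable or, on Windows, the chcp code page.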

$ python
Python 2.7.3 (default, Sep 26 2012, 21:51:14) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> import os
>>> sys.stdout.encoding
'UTF-8'
>>> os.environ["LANG"]
'pl_PL.UTF-8'
>>> print u"\xabtest\xbb"
«test»
>>>
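
As for the 'ascii' codec error in the question itself, an alternative to changing the default encoding is to decode the byte string explicitly at the point where it enters the program. The following is only a minimal sketch: the helper name to_text is hypothetical, and UTF-8 is an assumption, since the question does not say where the bytes come from. The u"" and b"" prefixes are used so that it should run on Python 2.7 as well as on Python 3.3+.

```python
def to_text(raw, encoding="utf-8"):
    """Return text, decoding byte strings explicitly (hypothetical helper).

    The default of UTF-8 is an assumption; pass the real encoding of
    your input if it is something else.
    """
    if isinstance(raw, bytes):
        return raw.decode(encoding)
    return raw


# UTF-8 bytes whose first byte is 0xec, as in the error message.
raw = b"\xec\x95\x88"

# Decoding with the ASCII codec fails, just like the implicit conversion.
try:
    raw.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# Decoding with the encoding the data really uses succeeds.
text = to_text(raw)
```

Decoding once at the input boundary keeps the rest of the program working with Unicode text, so no implicit ASCII conversion is triggered later.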

But relying on the detected encoding can cause issues when that encoding doesn't have the characters you want. You should instead try to degrade gracefully: in that case the chance of displaying the characters you want is close to 0, so you should fall back to a pure-ASCII version, use Unidecode to produce usable output, or simply fail. You could catch the exception and print a basic version of the string instead.

$ LANG=C python
Python 2.7.3 (default, Sep 26 2012, 21:51:14) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> import os
>>> sys.stdout.encoding
'ANSI_X3.4-1968'
>>> os.environ["LANG"]
'C'
>>> print u"\xabtest\xbb"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xab' in position 0: ordinal not in range(128)
>>>
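
The "catch the exception and print a basic version" idea could be sketched roughly as follows. The helper names (to_ascii_fallback, render_for, safe_print) are hypothetical, and stripping accents with the standard unicodedata module is only a stand-in for Unidecode, not its API. Unicode literals use the u"" prefix so the sketch should run on Python 2.7 as well as on Python 3.3+.

```python
import sys
import unicodedata


def to_ascii_fallback(text):
    """Approximate text in pure ASCII (stand-in for Unidecode)."""
    # Split accented letters into base letter + combining mark,
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = u"".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Anything still outside ASCII becomes "?".
    return stripped.encode("ascii", "replace").decode("ascii")


def render_for(text, encoding):
    """Return text unchanged if encoding can represent it, else a fallback."""
    try:
        text.encode(encoding)
        return text
    except UnicodeEncodeError:
        return to_ascii_fallback(text)


def safe_print(text, stream=None):
    """Write text to stream, degrading to ASCII when necessary."""
    stream = stream or sys.stdout
    # Streams without a known encoding are treated as ASCII-only.
    encoding = getattr(stream, "encoding", None) or "ascii"
    stream.write(render_for(text, encoding) + u"\n")


safe_print(u"\xabtest\xbb")
```

With a UTF-8 terminal this prints the string as is; under LANG=C the guillemets cannot be encoded, so the ASCII fallback is written instead of raising UnicodeEncodeError.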

But then there is Windows, which has problems with Unicode support. While chcp 65001 should technically work, it doesn't actually work unless you're using Python 3.3. Python uses the portable stdio.h, but cmd.exe expects Windows-specific calls such as WriteConsoleW(). Really, only 8-bit encodings (such as CP437) work reliably.

The workaround would be to use another terminal that supports Unicode properly, such as Cygwin's console or the IDLE included with Python.
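
If you have to stay with an 8-bit code page such as CP437, it can help to check ahead of time which characters it cannot represent. This is just a rough sketch: the function name unencodable_chars is hypothetical, and "cp437" is only the example code page mentioned above, so substitute whatever chcp reports on your machine.

```python
def unencodable_chars(text, codepage="cp437"):
    """List the distinct characters of text that codepage cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in missing:
                missing.append(ch)
    return missing


# The guillemets from the examples above exist in CP437.
guillemets_missing = unencodable_chars(u"\xabtest\xbb")

# Several Polish letters do not exist in CP437.
polish_missing = unencodable_chars(u"za\u017c\xf3\u0142\u0107")
```

An empty result means the string can be printed in that code page without loss; otherwise the returned characters are the ones that would need a fallback.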

Konrad Borowski