
I previously asked "A python program fails to execute in sublime text 3, but success in bash". After some research, I concluded that this needs a separate question.

In Python 2.7, sys.getdefaultencoding() returns ascii:

In [1]: import sys

In [2]: sys.getdefaultencoding()
Out[2]: 'ascii'

As I understand it, printing an object is equivalent to printing str(obj), so if obj is a unicode object, it will be encoded with ascii. For example (test.py):

#-*- encoding:utf-8 -*-
import sys
print sys.getdefaultencoding()  # ascii
print "你好"
print u"你好"  # I expected an error here: UnicodeEncodeError: 'ascii' codec...

But no error occurs in IPython:

In [3]: print "你好"
你好

In [4]: print u"你好"
你好

Why does printing a unicode object in IPython raise no error? Is my understanding wrong?

letiantian

1 Answer

When printing, the default encoding is used only when Python can't determine the terminal encoding. Both statements work in your case. The first, "你好", is a byte string that is already encoded in the terminal encoding. The second, u"你好", is a Unicode string that gets encoded to the terminal encoding, which, as the first statement already shows, supports Chinese.
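The rule above can be sketched roughly as follows. This is only a model of the behaviour, not Python's actual implementation: pick_print_encoding, encode_for_stream and FakeStream are hypothetical names, and the "ascii" default is passed in explicitly to stand in for Python 2's sys.getdefaultencoding().

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


class FakeStream(object):
    """Hypothetical stand-in for sys.stdout; only carries an encoding."""

    def __init__(self, encoding):
        self.encoding = encoding


def pick_print_encoding(stream, default_encoding="ascii"):
    # Use the stream's encoding when it is known; otherwise fall back to
    # the default ("ascii" here, assumed to mirror Python 2's default).
    return getattr(stream, "encoding", None) or default_encoding


def encode_for_stream(text, stream, default_encoding="ascii"):
    # Roughly what printing a Unicode string has to do before writing bytes.
    return text.encode(pick_print_encoding(stream, default_encoding))


text = u"\u4f60\u597d"  # 你好

# A terminal that reports UTF-8: encoding succeeds.
print(repr(encode_for_stream(text, FakeStream("utf-8"))))

# A stream with no known encoding (for example a pipe): ascii is used and fails.
try:
    encode_for_stream(text, FakeStream(None))
except UnicodeEncodeError as exc:
    print("UnicodeEncodeError with codec:", exc.encoding)
```

Under this model, a failure like the one seen in Sublime Text would correspond to the FakeStream(None) case, where no terminal encoding is available.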

ascii is used when Python 2.X coerces a Unicode string into a byte string. For example, the statement u'你好'.decode('utf8') is a common error: .decode() is being called on a Unicode string, but decoding only makes sense for byte strings. Python therefore first tries to encode the Unicode string to a byte string with the default ascii codec, so that .decode() can then be applied.

An example is below. Note that it is an encode error, not a decode error:

>>> u'你好'.decode('utf8')
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
  File "d:\dev\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
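The two hidden steps behind that traceback can be written out explicitly. This is a minimal sketch, assuming the coercion behaves as described above; py2_style_decode is a hypothetical helper, not something Python provides.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def py2_style_decode(text, encoding, default_encoding="ascii"):
    # Step 1 (implicit in Python 2): coerce the Unicode string to bytes
    # using the default codec.
    raw = text.encode(default_encoding)
    # Step 2: the decode the caller actually asked for.
    return raw.decode(encoding)


# Pure-ASCII text survives step 1, so the mistake goes unnoticed.
print(py2_style_decode(u"hello", "utf8"))

# Non-ASCII text fails in step 1, hence an *encode* error.
try:
    py2_style_decode(u"\u4f60\u597d", "utf8")  # 你好
    outcome = "no error"
except UnicodeEncodeError:
    outcome = "UnicodeEncodeError"
print(outcome)
```

The first call shows why this bug often stays hidden until non-ASCII data shows up.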
Mark Tolonen