
Python 3.4 and Python 3.8/3.9 behave differently when I execute the statement below:

print('\u212B')

Python 3.8/3.9 prints it correctly.

Python 3.4 raises an exception:

Traceback (most recent call last):
  File "test.py", line 9, in <module>
    print('\u212B')
UnicodeEncodeError: 'gbk' codec can't encode character '\u212b' in position 0: illegal multibyte sequence

According to this page, I can avoid the exception by replacing sys.stdout with a UTF-8 wrapper:

sys.stdout = io.TextIOWrapper(buffer=sys.stdout.buffer,encoding='utf-8')

But Python 3.4 still prints the wrong characters:

鈩?

So my questions are:

  1. Why do different Python versions behave differently when printing to standard output?
  2. How can I print the correct character in Python 3.4?

Edit 1:

I guess the difference is caused by PEP 528 -- Change Windows console encoding to UTF-8. But I still don't understand the mechanism of console encoding, or how I can print the correct character in Python 3.4.


Edit 2:

One more difference: sys.getfilesystemencoding() returns utf-8 in Python 3.8/3.9 and mbcs in Python 3.4.

ZMJ

1 Answer

Why?

Regarding the rationale behind the stdout encoding you can read more in the answers here: Changing default encoding of Python?

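In short, Python 3.4 encodes stdout with your OS's default encoding (gbk on your machine, as the traceback shows), whereas Python 3.6 and later write to the Windows console as UTF-8 (PEP 528), which is why 3.8/3.9 print the character correctly.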
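To make the difference concrete, here is a minimal sketch that reproduces both symptoms from the question without touching the console. It assumes only the standard gbk and utf-8 codecs; the variable names are illustrative. Encoding the character as gbk fails (the UnicodeEncodeError), and UTF-8 bytes read back as gbk come out as different characters, which is presumably what happened after sys.stdout was wrapped.

```python
ch = "\u212B"  # ANGSTROM SIGN, the character from the question

# UTF-8 can represent the character as three bytes.
utf8_bytes = ch.encode("utf-8")
print(utf8_bytes)

# gbk has no mapping for it, which is the error Python 3.4 reported.
try:
    ch.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError:
    gbk_ok = False
print(gbk_ok)

# A console that still expects gbk will misread the UTF-8 bytes.
# errors="replace" stands in for whatever the console does with bad bytes.
mojibake = utf8_bytes.decode("gbk", errors="replace")
print(ascii(mojibake))
```

So wrapping sys.stdout only changes the bytes Python emits; the console still has to interpret them as UTF-8.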

How to fix this?

From Python 3.7 on, you can use the stream's reconfigure method (it is not available in 3.4):

sys.stdout.reconfigure(encoding='utf-8')
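If the same script has to run on both old and new interpreters, one possible approach is to check for reconfigure and fall back to the io.TextIOWrapper wrapper from the question. This is only a sketch: force_utf8 is a hypothetical helper name, and it assumes the stream has a .buffer attribute when reconfigure is missing.

```python
import io


def force_utf8(stream):
    """Return a text stream that encodes its output as UTF-8.

    Hypothetical helper: uses reconfigure() where available (Python 3.7+),
    otherwise wraps the underlying binary buffer as in the question.
    """
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(encoding="utf-8")
        return stream
    # Older Pythons: assumes the stream exposes a binary .buffer
    return io.TextIOWrapper(stream.buffer, encoding="utf-8", line_buffering=True)
```

You would call it as sys.stdout = force_utf8(sys.stdout) before the first print. As in the question, this fixes the encoding error but not the display if the console itself is not reading UTF-8.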

Alternatively, you can set the environment variable PYTHONIOENCODING to utf-8 before starting Python:

set PYTHONIOENCODING=utf8

This works on most operating systems. On Windows with Python 3.6 or later, the console streams ignore it unless the legacy console I/O is also enabled through another environment variable:

set PYTHONLEGACYWINDOWSSTDIO=1
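To check what the variable does without changing your shell, you can start a child interpreter with PYTHONIOENCODING set and look at the raw bytes it writes. This is a sketch, not part of the original fix: it assumes subprocess.run (Python 3.5+) and captures the output through a pipe, so the Windows console rules do not apply to it.

```python
import os
import subprocess
import sys

# Copy the current environment and add the encoding override.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# Run the statement from the question in a child interpreter.
result = subprocess.run(
    [sys.executable, "-c", "print('\\u212B')"],
    env=env,
    stdout=subprocess.PIPE,
    check=True,
)

# Strip the trailing newline and show the bytes the child produced.
child_bytes = result.stdout.strip()
print(child_bytes)
```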

In Python versions before 3.7, you can install the win-unicode-console package, which handles Unicode console output transparently on Windows:

pip install win-unicode-console
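After installing, the package has to be switched on from your script. A guarded sketch follows; enable_unicode_console is a hypothetical wrapper name, and the version check assumes the package is only needed on Windows with Python older than 3.6, where PEP 528 does not yet apply.

```python
import sys


def enable_unicode_console():
    """Enable win-unicode-console where it is needed; return True if enabled."""
    # Assumption: only relevant on Windows with Python older than 3.6.
    if sys.platform != "win32" or sys.version_info >= (3, 6):
        return False
    try:
        import win_unicode_console
    except ImportError:
        # Package not installed; leave the streams untouched.
        return False
    win_unicode_console.enable()
    return True


enabled = enable_unicode_console()
print(enabled)
```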

If you are not running the code directly from a console, your IDE configuration may be interfering.

sophros