I'm on the CMD in Windows 8 and I've set the code page to 65001 (chcp 65001). I'm using Python 2.7.2 (ActivePython 2.7.2.5) and I've set the PYTHONSTARTUP environment variable to "bootstrap.py".
bootstrap.py:
import codecs
codecs.register(
    lambda name: name == 'cp65001' and codecs.lookup('UTF-8') or None
)
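As a sanity check on the hook itself, here is a minimal sketch that should confirm the name 'cp65001' resolves to UTF-8 once the search function is registered. The variable names (sample, encoded, same_as_utf8, round_trips) are only for illustration, and the sample string is written with escapes so the file stays pure ASCII. It only exercises encoding and decoding, not the console output:

```python
import codecs

# Same search function as in bootstrap.py: map the name 'cp65001' to UTF-8.
codecs.register(
    lambda name: name == 'cp65001' and codecs.lookup('UTF-8') or None
)

# Three Nordic letters (ae, o-slash, a-ring), escaped to keep this file ASCII.
sample = u'\xe6\xf8\xe5'

# Encode through the registered name and compare with plain UTF-8.
encoded = sample.encode('cp65001')
same_as_utf8 = encoded == sample.encode('utf-8')

# Decoding through the same name should give back the original string.
round_trips = encoded.decode('cp65001') == sample
```

If both flags come out true, the codec lookup is presumably not what is failing, which would point at the write to the console instead.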
This lets me print ASCII:
>>> print 'hello'
hello
>>> print u'hello'
hello
But the errors I get when I try to print a Unicode string with non-ASCII characters make no sense to me. Here I try to print a few strings containing Nordic characters (I added the extra line break between the prints for readability):
>>> print u'æøå'
��øåTraceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 2] No such file or directory
>>> print u'åndalsnes'
��ndalsnes
>>> print u'åndalsnesæ'
��ndalsnesæTraceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 22] Invalid argument
>>> print u'Øst'
��st
>>> print u'uØst'
uØstTraceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 22] Invalid argument
>>> print u'ØstÆØÅæøå'
��stÆØÅæøåTraceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 22] Invalid argument
>>> print u'_ØstÆØÅæøå'
_ØstÆØÅæøåTraceback (most recent call last):
File "<stdin>", line 1, in <module>
IOError: [Errno 22] Invalid argument
As you can see, it doesn't always raise an error (and doesn't even raise the same error every time), and the Nordic characters are only occasionally displayed correctly.
Can somebody explain this behavior, or at least help me figure out how to print Unicode to the CMD correctly?