The problem.
I'm using a Python 2.7 build in Sublime Text 3 and have an issue with printed output.
In some cases I get pretty confusing output for '\uFFFD' - the 'REPLACEMENT CHARACTER'.
For example:
print u'\ufffd' # should be '�' - the 'REPLACEMENT CHARACTER'
print u'\u0061' # should be 'a'
-----------------------------------------------------
[Finished in 0.1s]
After reversing the order of the statements:
print u'\u0061'
print u'\ufffd'
-----------------------------------------------------
a
�
[Finished in 0.1s]
So Sublime can print the '�' character, but for some reason it doesn't in the first case.
The dependence of the output on the order of the statements seems quite strange.
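One way to narrow this down is to look at the bytes the script sends, independent of how the output panel renders them. The sketch below is only an illustration: `encoded_bytes_hex` is a hypothetical helper, and it assumes the output encoding is UTF-8 (the value given to `PYTHONIOENCODING`). It runs on Python 2.7 and Python 3.

```python
from __future__ import print_function


def encoded_bytes_hex(text, encoding="utf-8"):
    """Return the bytes that `text` becomes under `encoding`, as hex pairs.

    Hypothetical helper for inspection only; it is not part of Sublime Text
    or of the Python standard library.
    """
    data = text.encode(encoding)
    # bytearray yields integers on both Python 2 and Python 3.
    return " ".join("{:02x}".format(b) for b in bytearray(data))


print(encoded_bytes_hex(u"\ufffd"))  # ef bf bd
print(encoded_bytes_hex(u"\u0061"))  # 61
```

If these hex values appear correctly in the panel while the raw character does not, the encoding step on the Python side is probably not the cause, although that remains an assumption until checked outside Sublime Text.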
The problem with the replacement character makes print output very unpredictable in general.
For example, I want to print decoded bytes with error replacement:
cp1251_bytes = '\xe4\xe0' # 'да' in cp1251
print cp1251_bytes.decode('utf-8', errors='replace')
-----------------------------------------------------
��
[Finished in 0.1s]
Now let's use different bytes:
cp1251_bytes = '\xed\xe5\xf2' # 'нет' in cp1251
print cp1251_bytes.decode('utf-8', errors='replace')
-----------------------------------------------------
[Finished in 0.1s]
And add one more print statement:
cp1251_bytes = '\xed\xe5\xf2' # 'нет' in cp1251
print cp1251_bytes.decode('cp1251')
print cp1251_bytes.decode('utf-8', errors='replace')
-----------------------------------------------------
нет
���
[Finished in 0.1s]
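To check what the decode calls above produce without relying on how the console draws the characters, the code points can be listed as plain ASCII. This is a minimal sketch; `describe_decoded` is a hypothetical name introduced here, and the byte strings are the ones from the examples above.

```python
from __future__ import print_function


def describe_decoded(raw, encoding="utf-8"):
    """Decode `raw` with error replacement and list the resulting code points.

    Hypothetical helper: the output is pure ASCII, so it does not depend on
    the console being able to render U+FFFD or Cyrillic letters.
    """
    text = raw.decode(encoding, errors="replace")
    return ["U+{:04X}".format(ord(ch)) for ch in text]


print(describe_decoded(b"\xe4\xe0"))                # ['U+FFFD', 'U+FFFD']
print(describe_decoded(b"\xed\xe5\xf2"))            # ['U+FFFD', 'U+FFFD', 'U+FFFD']
print(describe_decoded(b"\xed\xe5\xf2", "cp1251"))  # ['U+043D', 'U+0435', 'U+0442']
```

Both byte strings decode to a run of U+FFFD characters, so the decoded strings themselves are as expected even in the case where nothing appears in the panel.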
Below is an illustration of some other test cases:
To summarize, the described print behavior shows the following patterns:
'\ufffd' chars in print statement
My questions:
My Python 2.7 sublime-build file:
{
    "cmd": ["C:\\_Anaconda3\\envs\\python27\\python", "-u", "$file"],
    "file_regex": "^[ ]*File \"(...*?)\", line ([0-9]*)",
    "selector": "source.python",
    "env": {"PYTHONIOENCODING": "utf-8"}
}
With Python 2.7 installed separately from Anaconda the behavior is exactly the same.
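To confirm that the `env` setting from the build file reaches the interpreter, a short script can report the encoding that `print` uses. This is a sketch under assumptions: `stdout_encoding_report` is a hypothetical helper, and the values it shows depend on how the script is launched.

```python
from __future__ import print_function

import os
import sys


def stdout_encoding_report():
    """Collect the settings that decide how print encodes unicode text.

    Hypothetical helper for diagnostics only.
    """
    return {
        # Encoding Python chose for standard output (may be None when piped
        # on Python 2 without PYTHONIOENCODING).
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
        # Value passed through the build system's "env" block, if any.
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        "python_version": sys.version.split()[0],
    }


for name, value in sorted(stdout_encoding_report().items()):
    print("{0}: {1}".format(name, value))
```

Running it once through the Sublime build and once from a regular command prompt makes it possible to compare the two environments.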