I have a large project that runs fine with Python 2.7.9 on many devices.
But now the devices use Python 2.7.15, and in some cases it crashes when someone uses umlauts or an eszett, like äöüß.
In that case, a line like this raises an exception:
logger.info("device name {}".format(device_name))
I built a minimal test.py to reproduce the problem:
# -*- coding: utf-8 -*-
import locale
import os
import sys
print("#1 sys.stdout.encoding={}".format(sys.stdout.encoding))
print("#2 {}".format(locale.getdefaultlocale()))
u = u'aé ä ö ü ß'
print("#repr: " + repr(u.encode('utf-8')))
print("#3 type(u)={}".format(type(u)))
print(u.encode('utf-8', errors='ignore'))
print("#5 u={}".format(u))
With Python 2.7.9 it's fine:
#1 sys.stdout.encoding=ANSI_X3.4-1968
#2 (None, None)
#repr: 'a\xc3\xa9 \xc3\xa4 \xc3\xb6 \xc3\xbc \xc3\x9f'
#3 type(u)=<type 'unicode'>
aé ä ö ü ß
#5 u=aé ä ö ü ß
This fails only with 2.7.15, output:
#1 sys.stdout.encoding=ANSI_X3.4-1968
#2 (None, None)
#repr: 'a\xc3\xa9 \xc3\xa4 \xc3\xb6 \xc3\xbc \xc3\x9f'
#3 type(u)=<type 'unicode'>
aé ä ö ü ß
Traceback (most recent call last):
  File "utf8.py", line 16, in <module>
    print("#5 u={}".format(u))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
It fails even when I set:
export PYTHONIOENCODING="UTF-8"
export LC_ALL=en_GB.utf8
export LANG=en_GB.utf8
This alters the output, but doesn't help:
#1 sys.stdout.encoding=UTF-8
#2 ('en_GB', 'UTF-8')
#repr: 'a\xc3\xa9 \xc3\xa4 \xc3\xb6 \xc3\xbc \xc3\x9f'
#3 type(u)=<type 'unicode'>
aé ä ö ü ß
Traceback (most recent call last):
  File "utf8.py", line 16, in <module>
    print("#5 u={}".format(u))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 1: ordinal not in range(128)
I can fix this error with:
reload(sys)
sys.setdefaultencoding('utf8')
But this solution seems to be strongly discouraged, and I fear the side effects.
How can I fix this in a sane way?
Updating to Python 3 currently isn't an option.
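One direction I am considering, as a minimal sketch only: it assumes the exception comes from the implicit ASCII encode that happens when a unicode value is formatted into a byte-string template, and not from writing to stdout. I have not confirmed this on the 2.7.15 devices. The names `formatted` and `encoded` are just illustrative.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Same sample text as in test.py above.
u = u'aé ä ö ü ß'

# Assumption: a unicode template keeps the formatted result unicode, so
# format() has no reason to push the value through the default ASCII codec.
formatted = u"#5 u={}".format(u)

# Encode explicitly, once, at the point where bytes are actually needed.
encoded = formatted.encode('utf-8')

print(type(formatted).__name__, len(formatted))
print(repr(encoded))
```

If this holds up, the logger call would become `logger.info(u"device name {}".format(device_name))`, which leaves `sys.setdefaultencoding` untouched. Whether the logging handlers on the devices then write the unicode message correctly is a separate question I have not checked.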