I have the following Python script:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
print('☺')
When I run it on my Debian system, it produces the following output, as expected:
$ ./test.py
☺
$
However, when I change the locale to "C" by setting the LANG environment variable, the script throws a UnicodeEncodeError:
$ LANG=C ./test.py
Traceback (most recent call last):
  File "./test.py", line 4, in <module>
    print('\u263a')
UnicodeEncodeError: 'ascii' codec can't encode character '\u263a' in position 0: ordinal not in range(128)
$
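As far as I can tell, the failure comes from the encoding Python picks for stdout under the "C" locale ('ascii', judging by the traceback) rather than from the script itself. Here is a minimal sketch that tries to reproduce this without touching the environment, by printing to an in-memory text stream with an explicit encoding. The helper name `try_print` and the variable names are mine and purely illustrative, and the 'backslashreplace' error handler is only one possible way to degrade gracefully, not necessarily the best one:

```python
import io


def try_print(text, encoding, errors="strict"):
    """Print text to an in-memory stream that uses the given encoding.

    Returns the bytes that would have been written to the terminal.
    (Hypothetical helper, for illustration only.)
    """
    raw = io.BytesIO()
    # newline="\n" keeps the output identical across platforms.
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline="\n")
    print(text, file=stream)
    stream.flush()
    return raw.getvalue()


smiley = "\u263a"

# With a UTF-8 stream the character is encoded normally.
utf8_bytes = try_print(smiley, "utf-8")

# With an ASCII stream (what the "C" locale appears to give me),
# the same print call raises UnicodeEncodeError.
try:
    try_print(smiley, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# With a lenient error handler the character is escaped instead of raising.
escaped_bytes = try_print(smiley, "ascii", errors="backslashreplace")
```

Checking `sys.stdout.encoding` in both environments should confirm whether this is really what differs between the two runs.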
This problem prevents the script from being executed in minimal environments, such as during boot or in embedded systems. Also, I suspect that many existing Python programs can be broken by executing them with LANG=C. Here's an example on Stack Overflow of a program that presumably broke because it was executed in the "C" locale.
Is this a bug in Python? What's the best way to prevent this?