My boss asked me to put the following lines (from this answer) into a Python 3 script I wrote:

import sys
import codecs
sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach()) 

He says it's to prevent UnicodeEncodeErrors when printing Unicode characters in non-UTF8 locales. I am wondering whether this is really necessary, and why Python wouldn't handle encoding/decoding correctly without boilerplate code.

What is the most Pythonic way to make Python scripts compatible with different operating system locales? And what does this boilerplate code do exactly?

Nisse Engström
Jaap Joris Vens

Did you read the most upvoted answer, which starts with "Eek! Is that a well-known idiom in Python 2? It looks like a dangerous mistake to me."? Furthermore, that was recommended for Python 2, and the question itself says it doesn't work in Python 3! Elsewhere, answers to that same question point out: "sys.stdout is in text mode in Python 3. Hence you write unicode to it directly, and the idiom for Python 2 is no longer needed." – GreenAsJade Aug 27 '16 at 10:20

1 Answer

The answer provided here has a good excerpt from the Python mailing list regarding your question, quoted below. Based on it, I don't think this boilerplate is necessary, and replacing the default encoding wholesale is discouraged.

The only supported default encodings in Python are:

Python 2.x: ASCII
Python 3.x: UTF-8

If you change these, you are on your own and strange things will start to happen. The default encoding does not only affect the translation between Python and the outside world, but also all internal conversions between 8-bit strings and Unicode.

Hacks like what's happening in the pango module (setting the default encoding to 'utf-8' by reloading the site module in order to get the sys.setdefaultencoding() API back) are just downright wrong and will cause serious problems since Unicode objects cache their default encoded representation.

Please don't enable the use of a locale based default encoding.

If all you want to achieve is getting the encodings of stdout and stdin correctly setup for pipes, you should instead change the .encoding attribute of those (only).

--
Marc-Andre Lemburg
eGenix.com
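
To make the boilerplate and the quote's last point concrete, here is a minimal sketch. In Python 3 the .encoding attribute of sys.stdout is read-only, so the closest equivalent to "changing the encoding of stdout only" is TextIOWrapper.reconfigure() (Python 3.7+), or setting the PYTHONIOENCODING environment variable before starting the interpreter. The sketch uses in-memory streams as stand-ins for sys.stdout so it can run anywhere; the helper name make_text_stream is hypothetical and the sample string is only for illustration.

```python
import codecs
import io


def make_text_stream(encoding):
    """Hypothetical helper: an in-memory stand-in for sys.stdout.

    Returns the underlying binary buffer and a text wrapper that encodes
    with the given encoding, roughly what Python builds for stdout from
    the locale.
    """
    raw = io.BytesIO()
    return raw, io.TextIOWrapper(raw, encoding=encoding, write_through=True)


text = "caf\u00e9"  # contains one non-ASCII character

# 1. A text stream with an ASCII encoding cannot encode the character.
#    This is the UnicodeEncodeError the boilerplate tries to avoid.
raw1, ascii_out = make_text_stream("ascii")
try:
    ascii_out.write(text)
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# 2. What the boilerplate does: detach() hands back the binary buffer,
#    and codecs.getwriter("utf-8") wraps it in a writer that always
#    encodes as UTF-8, regardless of the locale.
raw2, text2 = make_text_stream("ascii")
wrapped = codecs.getwriter("utf-8")(text2.detach())
wrapped.write(text)
boilerplate_bytes = raw2.getvalue()

# 3. Python 3.7+ alternative: change the encoding of the existing
#    stream in place instead of replacing the stream object.
raw3, text3 = make_text_stream("ascii")
text3.reconfigure(encoding="utf-8")
text3.write(text)
text3.flush()
reconfigured_bytes = raw3.getvalue()
```

In a real script the third variant would presumably be a single call to sys.stdout.reconfigure(encoding="utf-8") near startup. Note that sys.stdout can be replaced by objects without a reconfigure() method (some IDEs and test runners do this), so a hasattr check may be prudent.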

Nisse Engström
navid