Python and Unicode stderr

Question

I use an anonymous pipe to capture all stdout and stderr output and then print it into a rich edit control. This works when I use wsprintf, but Python emits multibyte characters, which is really annoying. How can I convert all of this output to Unicode?

UPDATE 2010-01-03:

Thank you for the reply, but it seems str.encode() only works for output produced by print statements. If an error occurs during py_runxxx(), my redirected stderr captures the error message as a multibyte string. Is there a way to make Python output its messages as Unicode? There also seems to be a possible solution in this post.

I'll try it later.

score 9 · Answer 1 · answered Jan 04 '10 at 19:54

First, please remember that the Windows console may not fully support Unicode.

The example below makes Python write to stderr and stdout using UTF-8. You can change it to another encoding if you want.

#!/usr/bin/python
# -*- coding: UTF-8 -*-

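import codecs, sys

# Python 2 only: site.py deletes sys.setdefaultencoding at startup,
# so reload(sys) is needed to get it back.
reload(sys)
sys.setdefaultencoding('utf-8')

print sys.getdefaultencoding()

# Wrap both streams so that everything written to them is encoded as UTF-8.
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
sys.stderr = codecs.getwriter('utf8')(sys.stderr)

print "This is an Е乂αmp١ȅ testing Unicode support using Arabic, Latin, Cyrillic, Greek, Hebrew and CJK code points."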
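On Python 3 the reload(sys) / sys.setdefaultencoding trick is not available, and the standard streams are already text streams layered over byte streams. A minimal sketch of the same idea for Python 3, assuming the goal is only to force UTF-8 on a byte stream; the helper name `utf8_writer` is hypothetical, and an in-memory buffer stands in for the real stream:

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that text written to it is encoded as UTF-8.

    `utf8_writer` is a hypothetical helper name, not a standard library function.
    """
    return io.TextIOWrapper(
        binary_stream,
        encoding="utf-8",
        errors="backslashreplace",  # escape unencodable characters instead of raising
    )


# io.BytesIO stands in for sys.stdout.buffer or the write end of a pipe.
buffer = io.BytesIO()
writer = utf8_writer(buffer)
writer.write("Е乂αmp١ȅ")
writer.flush()  # push the encoded bytes through to the underlying stream

# Show the raw bytes that a reader on the other end of the pipe would see.
print(repr(buffer.getvalue()))
```

To apply this to the real streams, one would assign `sys.stdout = utf8_writer(sys.stdout.buffer)` and likewise for `sys.stderr`.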

Nice! Although replacing stdout and stderr was not necessary in my case. — derflocki, Jul 08 '14 at 15:50

score 0 · Answer 2 · edited Jan 07 '10 at 07:33

You can work with Unicode in Python either by marking string literals as Unicode (i.e. u'Hello World') or by calling the encode() method on a Unicode string.

For example, assuming you have a Unicode string, aStringVariable:

aStringVariable.encode('utf-8')

will convert it to UTF-8. 'utf-16' will give you UTF-16 and 'ascii' will convert it to a plain old ASCII string.
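A short sketch of what those three codec names produce, written for Python 3 (where every `str` is already Unicode); the sample text is made up for illustration. Note that 'ascii' raises UnicodeEncodeError for characters outside ASCII unless an error handler is given:

```python
text = "h\u00e9llo"  # "héllo": one non-ASCII character, written as an escape

utf8_bytes = text.encode("utf-8")    # é becomes the two bytes C3 A9
utf16_bytes = text.encode("utf-16")  # output starts with a byte order mark

try:
    text.encode("ascii")             # é has no ASCII representation
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# An error handler makes the ASCII conversion lossy instead of failing.
ascii_bytes = text.encode("ascii", errors="replace")

print(utf8_bytes, len(utf16_bytes), ascii_failed, ascii_bytes)
```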

For more information, see:

edited Jan 07 '10 at 07:33 · Craig McQueen

answered Jan 03 '10 at 06:29 · Adam Luchjenbroers

1. It is a bad practice to shadow builtin names (`str()` in this case). 2. `.encode()` should be called on Unicode string and not on byte-string. – jfs Jan 03 '10 at 19:48
That was just a bad choice for a variable name. I've changed it to something more obvious. – Adam Luchjenbroers Jan 03 '10 at 22:24

score -1 · Answer 3 · edited Aug 09 '15 at 15:14

wsprintf?

This seems to be a "C/C++" question rather than a Python question.

The Python interpreter always writes byte strings to stdout/stderr rather than Unicode (or "wide") strings. This means Python first encodes all Unicode data using the current encoding (likely sys.getdefaultencoding()).

If you want to get at stdout/stderr as Unicode data, you must decode the bytes yourself using the right encoding.

Your favourite C/C++ library certainly has what it takes to do that.
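The decode step is the same in any language: read raw bytes from the pipe, then decode them with the encoding the child used. A sketch in Python 3, assuming the child is a Python process and that the PYTHONIOENCODING environment variable is used to pin its stream encoding to UTF-8; the helper name `run_and_decode` is hypothetical. A C/C++ host would do the equivalent conversion after reading from its pipe, for example with MultiByteToWideChar on Windows.

```python
import os
import subprocess
import sys


def run_and_decode(code, encoding="utf-8"):
    """Run `code` in a child Python process and return its stderr as text.

    `run_and_decode` is a hypothetical helper for illustration only.
    """
    env = dict(os.environ)
    # Ask the child interpreter to use a known encoding for its standard streams.
    env["PYTHONIOENCODING"] = encoding
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,
    )
    # The pipe delivers bytes; decoding them is the parent's responsibility.
    return result.stderr.decode(encoding, errors="replace")


if __name__ == "__main__":
    # The child raises an error whose message contains a non-ASCII character.
    child_code = "raise ValueError('caf\\u00e9')"
    print(ascii(run_and_decode(child_code)))
```

Because the encoding is fixed by the parent rather than guessed, the traceback text arrives intact even when the message contains non-ASCII characters.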
