2

I used an anonymous pipe to capture all stdout and stderr output and then print it into a rich edit control. This works fine when I use wsprintf, but Python emits multibyte characters, which really annoys me. How can I convert all of this output to Unicode?

UPDATE 2010-01-03:

Thank you for the reply, but it seems that str.encode() only works for print xxx statements. If an error occurs during py_runxxx(), my redirected stderr captures the error message as a multibyte string. Is there a way to make Python output its messages as Unicode? There also seems to be an available solution in this post.

I'll try it later.

Community
fancyzero

3 Answers

9

First, please remember that the Windows console may not fully support Unicode.

The example below makes Python write to stderr and stdout using UTF-8. You can change it to another encoding if you want.

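#!/usr/bin/python
# -*- coding: UTF-8 -*-

import codecs, sys

# Hack: reload(sys) restores setdefaultencoding(), which site.py deletes at startup.
reload(sys)
sys.setdefaultencoding('utf-8')

print sys.getdefaultencoding()

# Wrap both streams so that Unicode strings are encoded as UTF-8 on output.
sys.stdout = codecs.getwriter('utf8')(sys.stdout)
sys.stderr = codecs.getwriter('utf8')(sys.stderr)

# The u prefix makes this a Unicode string, so the writer encodes it explicitly.
print u"This is an Е乂αmp١ȅ testing Unicode support using Arabic, Latin, Cyrillic, Greek, Hebrew and CJK code points."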
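To see what the codecs.getwriter('utf8') wrapper does without touching the real streams, here is a minimal sketch that wraps an in-memory buffer instead. The names raw, writer and encoded are only for illustration, and the BytesIO object is an assumed stand-in for sys.stdout or the write end of a pipe.

```python
import codecs
import io

raw = io.BytesIO()                      # stand-in for a byte-oriented stream
writer = codecs.getwriter('utf8')(raw)  # same kind of wrapper as used for sys.stdout
writer.write(u'\u03b1\u0661')           # Unicode text goes in (Greek alpha, Arabic-Indic one)
encoded = raw.getvalue()                # UTF-8 encoded bytes come out
```

Whatever reads the other end of the stream therefore receives UTF-8 bytes and has to decode them as UTF-8.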
sorin
0

You can work with Unicode in Python either by marking strings as Unicode (e.g. u'Hello World') or by using the encode() method, which should be called on Unicode strings.

E.g., assuming you have a Unicode string, aStringVariable:

aStringVariable.encode('utf-8')

will convert it to UTF-8. 'utf-16' will give you UTF-16, and 'ascii' will convert it to a plain old ASCII string (raising UnicodeEncodeError if the string contains non-ASCII characters).
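A small sketch of those three encodings, using a made-up sample string (text is a hypothetical name, not something from the question). The 'replace' error handler in the fallback is my own choice for illustration; it substitutes '?' for characters ASCII cannot represent.

```python
text = u'caf\u00e9'  # hypothetical sample: "café" as a Unicode string

utf8_bytes = text.encode('utf-8')    # é becomes the two bytes C3 A9
utf16_bytes = text.encode('utf-16')  # starts with a byte order mark

try:
    ascii_bytes = text.encode('ascii')  # fails: é is not an ASCII character
except UnicodeEncodeError:
    # Fall back to a lossy conversion instead of raising.
    ascii_bytes = text.encode('ascii', 'replace')
```

Decoding with the same codec name, e.g. utf16_bytes.decode('utf-16'), gives the original Unicode string back.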

For more information, see:

Craig McQueen
Adam Luchjenbroers
  • 1. It is a bad practice to shadow builtin names (`str()` in this case). 2. `.encode()` should be called on Unicode string and not on byte-string. – jfs Jan 03 '10 at 19:48
  • That was just a bad choice for a variable name. I've changed it to something more obvious. – Adam Luchjenbroers Jan 03 '10 at 22:24
-1

wsprintf?

This seems to be a "C/C++" question rather than a Python question.

The Python interpreter always writes byte strings to stdout/stderr rather than Unicode (or "wide") strings. This means Python first encodes all Unicode data using the current encoding (sys.stdout.encoding when it is set, otherwise likely sys.getdefaultencoding()).

If you want to get at stdout/stderr as Unicode data, you must decode it yourself using the right encoding.

Your favourite C/C++ library certainly has what it takes to do that.
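The same idea, sketched in Python rather than C/C++: pin the encoding the child interpreter uses for its standard streams, capture the raw bytes from the pipes, and decode them with that same encoding. This assumes Python 2.6 or later, where the PYTHONIOENCODING environment variable is available; run_and_decode is a hypothetical helper name, and an embedding host would do the equivalent decode on its own pipe buffer.

```python
import os
import subprocess
import sys

def run_and_decode(code, encoding='utf-8'):
    """Run `code` in a child Python and return (stdout, stderr) as Unicode."""
    env = dict(os.environ)
    # Ask the child interpreter to encode stdout/stderr with a known encoding.
    env['PYTHONIOENCODING'] = encoding
    proc = subprocess.Popen(
        [sys.executable, '-c', code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,
    )
    out_bytes, err_bytes = proc.communicate()
    # The pipes deliver bytes; decode them with the encoding chosen above.
    return out_bytes.decode(encoding), err_bytes.decode(encoding)
```

For example, run_and_decode("print(u'\\u03b1')") should hand back the Greek letter as a Unicode string, and anything the child writes to stderr comes back decoded the same way.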

EugZol
Antoine P.