I have a chunk of code like this:

city_name = obj['city_from']['name'].encode('utf-8')
print(city_name)

The output from this code is:

b'\xd8\xa8\xd9\x86\xd8\xaf\xd8\xb1\xd8\xb9\xd8\xa8\xd8\xa7\xd8\xb3'

and if I remove encode('utf-8'), the output changes to this:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-7: ordinal not in range(128)

The text is Persian (written in a script similar to Arabic). I wonder why the str class in Python 3 does not have a decode method. Do you have any solution to this problem?

thanks

Alex Koukoulas
Alireza Davoodi
  • Your terminal expects ASCII, so Python complies. Try changing the character set your terminal uses to UTF-8. – chepner Mar 20 '14 at 19:04

2 Answers

Your answer shows that your terminal accepts utf-8 byte sequences.

You don't need to convert a Unicode string to bytes before printing it. Python encodes it for you, using the encoding of the output stream.
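On the question of why str has no decode method: in Python 3, str is already decoded text, so it only offers encode(), while bytes only offers decode(). Here is a minimal sketch of that round trip. The byte string is copied from the output in the question; the helper name to_text is hypothetical and only there for illustration.

```python
# Bytes copied from the question's output (UTF-8 encoded Persian text).
raw = b'\xd8\xa8\xd9\x86\xd8\xaf\xd8\xb1\xd8\xb9\xd8\xa8\xd8\xa7\xd8\xb3'


def to_text(data, encoding="utf-8"):
    """Hypothetical helper: decode bytes into a str."""
    return data.decode(encoding)


city_name = to_text(raw)

# Decoding and encoding are inverse operations for a given encoding.
assert city_name.encode("utf-8") == raw

# 16 bytes on the wire, 8 characters of text.
print(len(raw), len(city_name))

# str can only be encoded; bytes can only be decoded.
print(hasattr(str, "decode"), hasattr(bytes, "decode"))
```

The 8 characters match the "position 0-7" reported in the UnicodeEncodeError above.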

To change the character encoding that Python uses for I/O, set the PYTHONIOENCODING=utf-8 environment variable or change your locale settings.

It looks like sys.stdout.encoding is ascii in your case.

$ python3 -c'import sys; print(sys.stdout.encoding)' 
UTF-8
$ python3 -c'import sys; print(sys.stdout.encoding)' | cat
ascii
$ LC_CTYPE=C python3 -c'import sys; print(sys.stdout.encoding)' 
ANSI_X3.4-1968

ANSI_X3.4-1968 is a canonical name for ascii.

$ PYTHONIOENCODING=uTf-8 python3 -c'import sys; print(sys.stdout.encoding)' | cat
uTf-8
$ LC_CTYPE=C.UTF-8 python3 -c'import sys; print(sys.stdout.encoding)' 
UTF-8

Don't hardcode the character encoding inside your scripts. Print Unicode strings and configure your environment appropriately instead.
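As a rough illustration of why the stream's encoding matters rather than the script, the sketch below prints the same str to two in-memory text streams that differ only in encoding. An io.TextIOWrapper over io.BytesIO stands in for sys.stdout here, which is an assumption made so the example does not depend on your terminal; the name print_with_encoding is hypothetical.

```python
import io

# The city name from the question, written with escapes so the source stays ASCII-only.
CITY = "\u0628\u0646\u062f\u0631\u0639\u0628\u0627\u0633"


def print_with_encoding(text, encoding):
    """Print text to an in-memory stream; return the bytes written, or None on failure."""
    raw = io.BytesIO()
    # Stand-in for sys.stdout configured with the given encoding.
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError:
        # Same failure as in the question: the stream cannot represent the text.
        return None
    return raw.getvalue()


ascii_result = print_with_encoding(CITY, "ascii")
utf8_result = print_with_encoding(CITY, "utf-8")

print(ascii_result)  # nothing was written: the ascii stream rejected the text
print(utf8_result)   # the UTF-8 bytes, produced without calling encode() yourself
```

The print call is identical in both cases; only the stream configuration changes, which is what PYTHONIOENCODING and the locale settings control for the real sys.stdout.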

jfs
  • Oh :| The problem is my zsh: when I use zsh, the encoding is ascii – Alireza Davoodi Mar 21 '14 at 09:49
  • Do you have `C.UTF-8` locale (run `locale -a`)? Try `LC_ALL=en_US.utf8 python3 ...` to override other locale settings (priority: `LC_ALL` > `LC_CTYPE` > `LANG`). Note: How to change the locale permanently is system-dependent e.g., on my system there is `/etc/default/locale` file: `LANG="en_US.UTF-8"` – jfs Mar 21 '14 at 10:06
Okay, I found my solution and it works like a charm:

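import sys
# buffer.write() accepts only bytes, so TestText2 must already be encoded, e.g. with .encode('utf-8')
sys.stdout.buffer.write(TestText2)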

UPDATE: this problem occurs in my zsh environment; when I use bash, everything is fine.

Alireza Davoodi
  • Since you've now proven your terminal/console is capable of showing UTF-8, see this answer to be able to use `print` without encoding: http://stackoverflow.com/a/1169209/5987 – Mark Ransom Mar 20 '14 at 19:51
  • @MarkRansom: How bytes are interpreted as text is defined by the *user environment*. Printing bytes that represent text "as is", as well as hardcoding the character encoding in your script, assumes too much about the user environment. [Either the environment should be fixed for all programs (in any language) that may produce non-ascii output or `PYTHONIOENCODING` may be configured explicitly for the script](http://stackoverflow.com/a/22552581/4279) – jfs Mar 21 '14 at 08:02