6

I am trying to build a new sentence from items in different lists. It raises an error when I print it with unicode(). I can print it normally (without unicode()). When I try to post it to a website, it raises the same error. I thought that if I could fix it for unicode(), it would also work when I post it to the website.

from random import randint

p=['Bu', 'Şu']
k=['yazı','makale']
t=['hoş','ilgiç']
connect='%s %s %s'%(p[randint(0,len(p)-1)],k[randint(0,len(k)-1)],t[randint(0,len(t)-1)])
print unicode(connect)

And the output is:
Error: UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 0: ordinal not in range(128)
Alkindus

4 Answers

1

First of all, put # -*- coding: utf-8 -*- at the top of your script so that you can use non-ASCII characters in it. Then, when printing, decode the str to unicode with an explicit encoding; that will solve your problem.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from random import randint

p=['Bu', 'şu']
k=['yazı','makale']
t=['hoş','ilginç']
connect='%s %s %s'%(p[randint(0,len(p)-1)],k[randint(0,len(k)-1)],t[randint(0,len(t)-1)])
print connect.decode('utf-8')
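
To illustrate why unicode(connect) fails while connect.decode('utf-8') works, here is a minimal sketch. It assumes the script is saved as UTF-8, so that 'Şu' is stored as the bytes 0xC5 0x9E 0x75; the names raw, ascii_ok and text are only illustrative. In Python 2, unicode(s) without an encoding argument decodes with the default codec, which is ASCII unless it has been changed, so it behaves like the explicit decode('ascii') below.

```python
# -*- coding: utf-8 -*-
# Assumption: the bytes below are what a UTF-8 editor stores for 'Şu'.
# 0xC5 is the first byte of 'Ş', which is the byte named in the error message.
raw = b'\xc5\x9eu'

# Decoding as ASCII fails, because 0xC5 is not in range(128).
try:
    raw.decode('ascii')
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding with the encoding the bytes were written in succeeds.
text = raw.decode('utf-8')
```

After this runs, ascii_ok is False and text is the two-character Unicode string for 'Şu'.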
0

You should put a header like this at the top of your script, specifying the encoding used on your system. It is worth reading more about this, as you may often run into these kinds of problems.

#!/usr/bin/env python
# -*- coding: latin-1 -*-

Be sure to replace 'latin-1' above with the encoding that is correct for you.
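
As a rough sketch of why the encoding has to match, the example below decodes the same bytes with two different codecs. It assumes the text was saved as UTF-8; the variable names are illustrative only. A wrong guess such as latin-1 does not raise an error here, it silently produces different characters.

```python
# -*- coding: utf-8 -*-
# 'hoş' written with an escape, so the sketch does not depend on how
# this file itself is saved.
original = u'ho\u015f'

# Assumption: these are the bytes an editor saving as UTF-8 would write.
stored = original.encode('utf-8')

# Decoding with the matching codec gives the original text back.
as_utf8 = stored.decode('utf-8')

# Decoding with the wrong codec raises no error but yields wrong text:
# the two bytes of 'ş' become two separate characters.
as_latin1 = stored.decode('latin-1')
```

Here as_utf8 equals the original three-character string, while as_latin1 has four characters and no longer contains 'ş'.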

Phani
  • 1
    Turkish is not Latin-1. We use UTF-8 for encoding websites. Adding the header doesn't affect the result. It is necessary when you use Turkish characters in the code. – Alkindus Nov 11 '14 at 09:57
  • I thought you did use the Turkish characters when assigning the lists `p`, `k` and `t`. – Phani Nov 11 '14 at 09:58
0
>>> from random import randint
>>> p=['Bu', 'Şu']
>>> k=['yazı','makale']
>>> t=['hoş','ilgiç']
>>> connect='%s %s %s'%(p[randint(0,len(p)-1)],k[randint(0,len(k)-1)],t[randint(0,len(t)-1)])
>>> print connect.decode('utf-8')
Şu makale ilgiç
Irshad Bhat
0

When using non-ASCII characters, specify the encoding of the source code at the top of the file. Then, use Unicode strings for all text:

#coding:utf8
from random import randint
p=[u'Bu', u'Şu']
k=[u'yazı', u'makale']
t=[u'hoş', u'ilgiç']
connect= u'%s %s %s'%(p[randint(0,len(p)-1)],k[randint(0,len(k)-1)],t[randint(0,len(t)-1)])
print connect

Output:

Şu yazı ilgiç

You could still get UnicodeEncodeError if your execution environment doesn't support the character set. Ideally use an environment that supports an output encoding of UTF-8.
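
A minimal sketch of that failure mode, and of two ways around it, follows. It assumes an output target that only accepts ASCII; the variable names are illustrative. Encoding explicitly as UTF-8 is also what you would typically do before sending the text to a website that expects UTF-8, though the details depend on the library used for posting.

```python
# -*- coding: utf-8 -*-
text = u'\u015eu yaz\u0131'  # u'Şu yazı', written with escapes

# An ASCII-only target cannot represent the Turkish letters.
try:
    text.encode('ascii')
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Encoding explicitly as UTF-8 works for this text.
utf8_bytes = text.encode('utf-8')

# errors='replace' is a lossy fallback for limited targets:
# each unencodable character becomes '?'.
fallback = text.encode('ascii', 'replace')
```

The fallback keeps the program running but loses the Turkish letters, so it is only suitable where readable output matters more than exact text.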

Mark Tolonen