
I have a problem in Python with Unicode. I need to plot a graph with Unicode annotations in it. According to the tutorial, I should just create my string as Unicode. I do it like this:

annotation = u"%s has %s rev"%(art.title, len(art.revisions))

It is art.title that has Unicode characters in it. Sometimes that code works, sometimes it gives me the error below:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)

How can I fix it?

EDIT: The error occurs exactly on the "annotation" line:

  File "script.py", line 195, in test_trie
    annotation = u"%s has %s rev"%(art.title, len(art.revisions))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 19: ordinal not in range(128)
ashim
  • Where is `art.title` coming from? – Thomas K Apr 20 '12 at 00:36
  • Are you sure the error message is given for the code line you gave us? I suspect the error actually occurs when you print out `annotation`. In that case, could you show that code line as well? – jogojapan Apr 20 '12 at 03:26
  • @jogojapan yes, see edit to the question. – ashim Apr 20 '12 at 03:40
  • Have you tried `annotation = u"%s has %d rev" % (art.title.decode('utf-8'), len(art.revisions))`, as suggested by Maksym Kozlenko below? – jogojapan Apr 20 '12 at 03:47

2 Answers


I think it depends on whether your title has Unicode characters in it or not.

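I would try art.title.decode("utf-8") (or art.title.encode("utf-8")) and see how it works. Since the error is a UnicodeDecodeError, art.title is probably a UTF-8 encoded byte string in the failing cases, so decode is the more likely fix.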
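
Because the code only fails sometimes, art.title may be a byte string in some cases and already Unicode in others. In Python 2, formatting a byte string into a u"..." template makes Python decode it implicitly as ASCII, which is where the error comes from. Here is a minimal sketch that decodes only when needed. The helper name to_text and the sample values are made up for illustration, and it assumes the bytes really are UTF-8:

```python
def to_text(value, encoding="utf-8"):
    """Return value as a Unicode string, decoding it only if it is bytes.

    Hypothetical helper: assumes byte strings are encoded with `encoding`.
    """
    if isinstance(value, bytes):  # `bytes` is an alias of `str` in Python 2.6+
        return value.decode(encoding)
    return value


# Sample data standing in for art.title and len(art.revisions).
title_as_bytes = b"Caf\xc3\xa9"  # UTF-8 bytes; 0xc3 is the byte from the error message
title_as_text = u"Caf\xe9"       # the same title, already Unicode
revision_count = 3

# Both kinds of title now format without an implicit ASCII decode.
annotation_from_bytes = u"%s has %s rev" % (to_text(title_as_bytes), revision_count)
annotation_from_text = u"%s has %s rev" % (to_text(title_as_text), revision_count)
```

In the original line this would become u"%s has %s rev" % (to_text(art.title), len(art.revisions)). Finding out where art.title comes from and decoding it once at that point would be cleaner, if that is possible.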

Blorgbeard
Maksym Kozlenko

You have two options: either use art.title.decode('utf_8'), or create a new Unicode string by decoding the bytes as UTF-8 with unicode(art.title, 'utf_8').
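
A small sketch of both options side by side, assuming the title is a UTF-8 encoded byte string. The sample value raw_title is made up, and the text_type alias is only there so the snippet also runs on Python 3, where unicode is called str:

```python
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str      # Python 3: the Unicode string type is named str

# Hypothetical stand-in for art.title: UTF-8 bytes containing 0xc3.
raw_title = b"Z\xc3\xbcrich"

via_decode = raw_title.decode("utf_8")           # option 1: method on the byte string
via_constructor = text_type(raw_title, "utf_8")  # option 2: constructor with an encoding

# Either result can be formatted into a Unicode template safely.
annotation = u"%s has %s rev" % (via_decode, 12)
```

Both calls only work on byte strings. If art.title is already Unicode, decode it conditionally instead.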

Makoto
  • Those two expressions do the same thing, and the `decode` would be my choice. For a similar situation see http://stackoverflow.com/questions/7585435/best-way-to-convert-string-to-bytes-in-python-3 – Mark Ransom Apr 20 '12 at 03:59