11

I have the following code, written in Python 2.7:

# -*- coding: utf-8 -*-    
import sys

_string = "años luz detrás"
print _string.encode("utf-8")

This throws the following error:

print _string.encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

Any help is appreciated. Thanks in advance.

aledustet

2 Answers

9

Add a u prefix before the opening " so that the literal is a Unicode string rather than a bytestring:

>>> _string = u"años luz detrás"
>>> print _string.encode("utf-8")
años luz detrás

This would do.

Bleeding Fingers
4

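In Python 2, a string literal "" creates a bytestring. When you call .encode("utf-8") on a bytestring, Python first tries to decode it into a Unicode string using the default character encoding (ascii), and only then executes .encode("utf-8"). That implicit decode is what fails on the non-ASCII byte 0xc3, which is why an encode call raises a UnicodeDecodeError.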
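The hidden decode step can be spelled out explicitly. The following is only a sketch of the behaviour described above, not Python's actual implementation: encode_like_python2 is a hypothetical helper introduced for illustration, and the code is written so that it should run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# The bytes that the Python 2 literal "años luz detrás" holds
# when the source file is saved as UTF-8.
raw_bytes = u"años luz detrás".encode("utf-8")


def encode_like_python2(byte_string, target="utf-8"):
    """Hypothetical helper: mimic what Python 2 does for str.encode()."""
    # Step 1 (implicit in Python 2): decode with the default codec, ascii.
    as_unicode = byte_string.decode("ascii")
    # Step 2: the encode that was actually requested.
    return as_unicode.encode(target)


try:
    encode_like_python2(raw_bytes)
    error_message = None
except UnicodeDecodeError as exc:
    # Step 1 fails on the first non-ASCII byte, before step 2 ever runs.
    error_message = str(exc)

print(error_message)

# Decoding with the codec the bytes were really written in succeeds.
round_tripped = raw_bytes.decode("utf-8").encode("utf-8")
print(round_tripped == raw_bytes)
```

The printed message should match the error in the question, since the failure comes from the ascii decode rather than from the utf-8 encode.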

u"" creates a Unicode string, so no implicit decode is needed. It fixes the UnicodeDecodeError, as @Bleeding Fingers suggested.

# -*- coding: utf-8 -*-    
print u"años luz detrás"

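Printing a Unicode string might lead to a UnicodeEncodeError if stdout is redirected (for example to a file or a pipe), because Python 2 then cannot detect the output encoding and falls back to ascii. Set the PYTHONIOENCODING environment variable (for example to utf-8) in this case.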
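A rough way to try the redirected case is to run a child interpreter with its stdout captured through a pipe and PYTHONIOENCODING set. This is a minimal sketch under the assumption that sys.executable points at a usable interpreter; child_code is an illustrative name, and the text uses escape sequences so the child does not depend on any source encoding.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import os
import subprocess
import sys

# Program run by the child interpreter; \xf1 and \xe1 are ñ and á.
child_code = 'print(u"a\\xf1os luz detr\\xe1s")'

# Copy the current environment and force UTF-8 for the child's stdio.
env = dict(os.environ, PYTHONIOENCODING="utf-8")

# check_output captures stdout through a pipe, so the child sees
# redirected output instead of a terminal.
output = subprocess.check_output([sys.executable, "-c", child_code], env=env)

# The captured bytes are UTF-8 because of PYTHONIOENCODING.
text = output.decode("utf-8").strip()
print(repr(text))
```

From a shell, the equivalent is to set the variable for a single command before redirecting the script's output.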

jfs