I'm reading a UTF-8 encoded CSV file with Python:

import csv

f = open('test.csv', 'r')
reader = csv.reader(f)
r = reader.next()  # Python 2; in Python 3 this is next(reader)
r[10]

which returns

'\xc3\x9altima actualizaci\xc3\xb3n del ejercicio 2014: 27 Abril 2014.'

That should be 'Última actualización del ...'

I'm wondering how that data is encoded (multi-byte, perhaps?) and how I can convert it to a normal string with the following content: 'Última actualización del ...'

I tried with:

r[10].decode('utf8')

but I got

u'\xdaltima actualizaci\xf3n del ejercicio 2014: 27 Abril 2014.'
opensas
  • Which version of Python are you using? Which IDE? Have you read [this](https://docs.python.org/2/howto/unicode.html)? – jonrsharpe May 13 '14 at 06:56
  • This is a very good read; it will explain what is going on. http://www.joelonsoftware.com/articles/Unicode.html – JensB May 13 '14 at 06:57

1 Answer

Maybe try:

import codecs
with codecs.open('test.csv', 'r', 'utf-8') as f:
    ...
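
A caveat on the snippet above: the Python 2 csv module works on byte strings and its documentation says it does not support Unicode input, so handing it the unicode lines that codecs.open produces may fail on non-ASCII text. On Python 3 the built-in open accepts an encoding argument and csv.reader yields ordinary str values. Here is a minimal sketch, assuming Python 3; the temporary file and its two-column content are made up for illustration and stand in for test.csv:

```python
import csv
import os
import tempfile

# Hypothetical stand-in for test.csv: one row, written as UTF-8 bytes.
path = os.path.join(tempfile.mkdtemp(), 'test.csv')
with open(path, 'wb') as f:
    f.write('id,Última actualización\n'.encode('utf-8'))

# newline='' is what the csv docs recommend when opening files for csv.reader.
with open(path, 'r', encoding='utf-8', newline='') as f:
    reader = csv.reader(f)
    row = next(reader)  # Python 3 spelling of reader.next()

print(row[1])
```

Because the decoding happens while the file is read, no per-cell decode call is needed afterwards.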

Similar answer: Difference between open and codecs.open in Python
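
As for the output in the question, the decode call appears to have worked. u'\xdaltima ...' is the repr of a unicode string, and \xda is just the escape for the code point U+00DA (Ú). Printing the value instead of echoing it at the prompt should show the accented text, provided the terminal can display it. Here is a small sketch, written for Python 3, where byte strings need the b prefix:

```python
# The bytes from the question: 0xC3 0x9A is the UTF-8 encoding of 'Ú',
# and 0xC3 0xB3 is the UTF-8 encoding of 'ó'.
raw = b'\xc3\x9altima actualizaci\xc3\xb3n del ejercicio 2014: 27 Abril 2014.'

text = raw.decode('utf-8')

print(repr(text))  # escaped or literal form, depending on the Python version
print(text)        # the human-readable text

# Each accented letter takes two bytes but is a single code point.
print(len(raw) - len(text))
```

So the data is multi-byte only in its encoded form; once decoded, each accented letter is a single character.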

user3588162