UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 2: invalid continuation byte

Question

I am using pandas to read CSV files and split a column that holds the date and hour together, in the format "dd/mm/yy hh".

I got help here writing a script that splits that column into two separate columns.

First of all, this is what the dataset looked like:

The combined field is "FECHA", and I managed to run this code on some of the CSV files:

import pandas as pd

# Read the source CSV (pandas decodes it as UTF-8 unless told otherwise).
sal = pd.read_csv('C:/Users/drivasti/Documents/002_Script_Separa_Fecha_Hora/Anexo2_THP_UL.csv')

# Split FECHA at the first space and append the two parts as DATE and HOUR.
df = sal.join(sal['FECHA'].str.partition(' ')[[0, 2]]).rename({0: 'DATE', 2: 'HOUR'}, axis=1)

df.to_csv('C:/Users/drivasti/Documents/002_Script_Separa_Fecha_Hora/Anexo2_THP_UL_2.csv', index=False)

And they worked perfectly as seen here:

However, I got this error when I ran the script on another CSV file (note that I change the file name every time I run it, but they are all CSV files):

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 2: invalid continuation byte

I have tried some of the answers to these questions, but none of them helped:

UnicodeDecodeError: 'utf-8' codec can't decode byte
'utf-8' codec can't decode byte 0xdb in position 1:

Does anyone know how to parse this as UTF-8? Or is the problem in the "FECHA" field?
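One way to narrow down where the offending byte sits is to scan the raw bytes line by line and try to decode each line as UTF-8. The following is only a rough sketch: the helper name find_non_utf8_lines and the sample bytes are made up for illustration and are not taken from the actual files.

```python
def find_non_utf8_lines(lines):
    """Return (line_number, raw_bytes, error) for every line that is not valid UTF-8."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            bad.append((number, raw, exc))
    return bad


# Made-up sample standing in for a real file: the header contains "AÑO"
# written in ISO-8859-1, so the Ñ is stored as the single byte 0xd1.
sample = [b"FECHA,A\xd1O\n", b"01/08/19 14,2019\n"]
bad_lines = find_non_utf8_lines(sample)
for number, raw, exc in bad_lines:
    print(number, raw, exc.reason)

# For a real file, iterate over it in binary mode instead:
# with open("some_file.csv", "rb") as handle:
#     bad_lines = find_non_utf8_lines(handle)
```

If the reported lines are all in the header or in a text column rather than in "FECHA", that would suggest the file simply uses a different encoding and the date column itself is fine.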

In ISO8859-1, byte 0xd1 is Ñ. Is there such a character in your data? Most likely you would need to change the encoding to a suitable one if the input data isn't UTF-8. I can't say off the top of my head how to do that in pandas. — Sami Kuhmonen, Aug 01 '19 at 14:37
Does it need to be parsed as UTF-8? You can pass another encoding by specifying it: pandas.read_csv(file, encoding='ISO8859-1'). If it does need to be UTF-8, you will probably have to use open(), which lets you escape or replace unknown characters. You can then create a df manually. — Miquel Vande Velde, Aug 01 '19 at 15:04
It worked using pandas.read_csv(file, encoding='ISO8859-1'). But I use Spanish words, so I need the "Ñ" character. Is there an encoding that includes it? — Daniel Rivas, Aug 01 '19 at 15:38
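To illustrate the comments above: ISO-8859-1 does include "Ñ" (it is exactly the byte 0xd1), so decoding with that encoding keeps the character, whereas forcing UTF-8 with replacement loses it. This is a small standard-library sketch with made-up sample data, assuming the file really was written in ISO-8859-1; the same encoding name is what was passed to pandas.read_csv in the comment above.

```python
import csv
import io

# Made-up CSV bytes encoded as ISO-8859-1; 0xd1 is the single byte for "Ñ".
raw = "FECHA,NOMBRE\n01/08/19 14,NUÑEZ\n".encode("iso-8859-1")

# Decoding with the encoding the data was written in keeps the Ñ intact.
text = raw.decode("iso-8859-1")
rows = list(csv.reader(io.StringIO(text)))
print(rows[1])

# Forcing UTF-8 with errors="replace" does not raise, but the Ñ is lost:
# the undecodable byte becomes U+FFFD, the replacement character.
lossy = raw.decode("utf-8", errors="replace")
print(lossy.splitlines()[1])

# Once decoded correctly, the text can be re-encoded as UTF-8 if needed.
utf8_bytes = text.encode("utf-8")
```

If the files come from a Windows program, cp1252 is another plausible candidate; it also maps 0xd1 to "Ñ", so either choice would preserve that character.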
