
I am using pandas to read CSV files and split a column that holds "Date + Hour" in the format "dd/mm/yy hh".

I had help here writing a script that splits that column into two separate columns.

First of all, this is what the dataset looked like:

[image: screenshot of the original dataset with the combined FECHA column]

The combined field is "FECHA", and I managed to run this code on some of the CSV files:

import pandas as pd

sal = pd.read_csv('C:/Users/drivasti/Documents/002_Script_Separa_Fecha_Hora/Anexo2_THP_UL.csv')

# Split FECHA at the first space: part 0 is the date, part 2 is the hour
df = sal.join(sal['FECHA'].str.partition(' ')[[0, 2]]).rename({0: 'DATE', 2: 'HOUR'}, axis=1)

df.to_csv('C:/Users/drivasti/Documents/002_Script_Separa_Fecha_Hora/Anexo2_THP_UL_2.csv', index=False)

It worked perfectly on those files, as seen here:

[image: screenshot of the output with separate DATE and HOUR columns]
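
Since the screenshots are not reproduced here, the split can be illustrated on a tiny made-up frame. This is only a sketch: the sample rows and the VALOR column are invented for illustration, and only the FECHA, DATE and HOUR names come from the code above.

```python
import pandas as pd

# Hypothetical sample rows; only the FECHA column name comes from the question.
sal = pd.DataFrame({"FECHA": ["01/08/19 14", "02/08/19 03"], "VALOR": [10, 20]})

# str.partition(' ') yields three columns: the text before the first space (0),
# the separator itself (1), and the text after it (2). Keep 0 and 2 only.
parts = sal["FECHA"].str.partition(" ")[[0, 2]]

# Attach the two parts to the original frame and give them readable names.
df = sal.join(parts).rename({0: "DATE", 2: "HOUR"}, axis=1)
print(df)
```

Both new columns hold strings, so an hour such as "03" keeps its leading zero.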


However, I got this error when I ran the same code on another CSV file (I change the file name every time I run it, but they are all CSV files):

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd1 in position 2: invalid continuation byte
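
The same message can be reproduced without pandas, which suggests the file's bytes are simply not UTF-8. The sketch below assumes the offending text is a word containing "Ñ" saved in ISO-8859-1; the sample bytes are made up, not taken from the real file.

```python
# Hypothetical bytes: the word "CAÑETE" as an ISO-8859-1 file would store it.
raw = b"CA\xd1ETE"

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # In UTF-8, 0xd1 announces a two-byte character, but the next byte ("E")
    # is not a valid continuation byte, so decoding stops here.
    error_position = exc.start
    error_reason = exc.reason
    print(exc)

# Decoding with the encoding the bytes were actually written in works.
decoded = raw.decode("iso-8859-1")
print(decoded)
```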


I have tried some of the answers to the following questions, but none have helped:

UnicodeDecodeError: 'utf-8' codec can't decode byte
'utf-8' codec can't decode byte 0xdb in position 1:


Does anyone know how to parse this as UTF-8? Or is the problem in the "FECHA" field?

  • In ISO8859-1, byte 0xd1 is Ñ; is there such a character in the data? Most likely you need to change the encoding to a suitable one if the input data isn't UTF-8. I can't tell off the top of my head how to do that in pandas. – Sami Kuhmonen Aug 01 '19 at 14:37
  • Does it need to be parsed as UTF-8? You can pass another encoding by specifying it: pandas.read_csv(file, encoding='ISO8859-1'). If it does need to be UTF-8, you will probably have to use open(), which lets you escape or replace unknown characters. You can then create a df manually. – Miquel Vande Velde Aug 01 '19 at 15:04
  • It worked using pandas.read_csv(file, encoding='ISO8859-1'). But I use Spanish words, so I need the "Ñ" character. Is there an encoding that includes it? – Daniel Rivas Aug 01 '19 at 15:38
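
On the last comment: ISO-8859-1 does include "Ñ" (it is the byte 0xd1 mentioned in the first comment), so reading with that encoding should keep the character intact. The sketch below assumes the file really is ISO-8859-1. The sample contents and the DISTRITO column are made up, and an in-memory buffer stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical file contents, encoded the way the failing CSV is assumed to be.
raw = "FECHA,DISTRITO\n01/08/19 14,CAÑETE\n".encode("iso-8859-1")

# "latin-1" is Python's alias for ISO-8859-1; the Ñ is decoded correctly.
sal = pd.read_csv(io.BytesIO(raw), encoding="latin-1")

# Same split as in the question.
df = sal.join(sal["FECHA"].str.partition(" ")[[0, 2]]).rename(
    {0: "DATE", 2: "HOUR"}, axis=1
)

# Once loaded, the text is ordinary Unicode, so it can be written back out
# as UTF-8 if that is the encoding wanted for the result.
csv_text = df.to_csv(index=False)
utf8_bytes = csv_text.encode("utf-8")
print(csv_text)
```

With a real file, the path would replace the buffer in read_csv, and to_csv accepts an encoding argument when writing to a path.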
