
I am having trouble reading a CSV file using read_csv in pandas. Here's the error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

I have tried several different encodings with this file and none seem to work. The file is from Google's Search Ads 360 product, which says the CSV should be in the 'UTF-16' format. Strangely, if I open the file in Excel and save it as UTF-8, I can use read_csv normally.

I've tried the solutions to a similar problem here, but they did not work for me. This is the only code I am running:

import pandas as pd

df = pd.read_csv('path/file.csv')

Edit: I read in the file as tab delimited, and that seemed to work. I still don't understand why I got the error I did when I tried to read it in as a normal csv. Any insight into this would be appreciated!!
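A note on the error itself: the byte 0xff at position 0 is consistent with a UTF-16 little-endian byte order mark (FF FE), which is never a valid first byte in UTF-8. Below is a minimal sketch for checking the leading bytes of a file. `sniff_bom` is a hypothetical helper, and the sample file and its contents are made up for the demonstration; this is not the actual Search Ads 360 export.

```python
import codecs
import os
import tempfile


def sniff_bom(path):
    """Guess an encoding from a leading byte order mark (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(4)
    # UTF-32 is checked first: its little-endian BOM also starts with FF FE.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    return None  # no BOM found; the encoding must be determined another way


# Made-up sample: a tab-separated file written as UTF-16 LE with a BOM.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF16_LE + "Campaign\tClicks\nBrand\t10\n".encode("utf-16-le"))
    sample_path = tmp.name

with open(sample_path, "rb") as f:
    first_byte = f.read(1)  # the byte the UTF-8 decoder would reject
detected = sniff_bom(sample_path)
os.remove(sample_path)

print(first_byte, detected)
```

If the real file also starts with FF FE, the default UTF-8 decoder fails on the very first byte, before the delimiter is ever considered.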

1 Answer


Try this encoding:

import pandas as pd

df = pd.read_csv('path/file.csv', encoding='cp1252')
Alejandro A
  • I tried that one already :( I should note that when I use that encoding or UTF-16, I don't get the same error. Instead, when I use either of these encodings, the CSV is not loaded properly: the data is smushed into a few columns. – Kyle Zengo Jul 01 '20 at 19:48
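The smushed columns described in this comment would be consistent with a tab-delimited file being parsed with the default comma separator, which matches the edit in the question. Here is a minimal sketch that combines `encoding='utf-16'` with `sep='\t'`. The sample file, its column names, and its values are assumptions invented for the demonstration.

```python
import codecs
import os
import tempfile

import pandas as pd

# Made-up sample standing in for the real export: UTF-16 with a BOM, tab-separated.
text = "Campaign\tClicks\nBrand\t10\nGeneric\t25\n"
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))
    path = tmp.name

# The right encoding with the default comma separator:
# every row lands in a single column.
squashed = pd.read_csv(path, encoding="utf-16")

# The right encoding and the right separator: columns split as intended.
df = pd.read_csv(path, encoding="utf-16", sep="\t")

os.remove(path)

print(squashed.shape)
print(df.shape)
```

With this sample, the first call decodes without error but yields one column, while the second yields the two expected columns. Both `encoding` and `sep` have to match the file.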