Pandas read_csv() doesn't work on CSV file in Python 3?

Question

I used Python 2.7.10 before and recently switched to Python 3.6. Now importing CSV files fails. My code is simple, and I think it would have worked in Python 2:

data = pd.read_csv('data.csv')

The error is:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

What does this mean and how can I solve this problem? Thanks.

Update

I've solved it by adding something like this:

data = pd.read_csv('data.csv', sep='\t', encoding='utf-16')

Although I still don't know why it works, thanks for your help anyway.
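
One likely explanation, offered as a sketch rather than a confirmed diagnosis: the byte 0xff at position 0 in the error is consistent with a UTF-16 little-endian byte order mark (the bytes FF FE), and 0xff is never valid in UTF-8. `read_csv` decodes as UTF-8 by default, so the read fails until `encoding='utf-16'` is passed. The helper below, `sniff_bom` (a hypothetical name, not part of pandas), checks the first bytes of a file for a known BOM using the constants in the standard `codecs` module.

```python
import codecs

# BOM signatures, checked longest-first because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(path):
    """Return a codec name if the file starts with a known BOM, else None.

    This is an illustrative helper, not a pandas or stdlib function.
    """
    with open(path, "rb") as f:
        head = f.read(4)  # the longest BOM is 4 bytes
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None
```

If `sniff_bom('data.csv')` returns `'utf-16'`, that matches the fix above. A return value of `None` only means no BOM was found; it does not prove the file is UTF-8.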

What is the encoding of your CSV file? Can you show a small sample CSV demonstrating the problem? — BrenBarn, Feb 03 '18 at 08:06
try passing `encoding = "ISO-8859-1"` as a parameter to `read_csv` — Vivek Kalyanarangan, Feb 03 '18 at 08:13
This is the list of encodings; use the one that suits your data: [unicodes link](https://docs.python.org/3/library/codecs.html#standard-encodings) — A_emperio, Feb 03 '18 at 08:18
Sorry. Actually I have no idea what my CSV file encoding is. How can I check the encoding type? In fact, the data is downloaded from CSMAR, if you guys know. — Truefan, Feb 03 '18 at 08:37
Possible duplicate of ['utf-8' codec can't decode byte 0x92 in position 18: invalid start byte](https://stackoverflow.com/questions/46000191/utf-8-codec-cant-decode-byte-0x92-in-position-18-invalid-start-byte) — Sociopath, Feb 03 '18 at 09:32

score 0 · Answer 1 · answered Feb 03 '18 at 09:29

I just had this problem. This post helped me:

'utf-8' codec can't decode byte 0x92 in position 18: invalid start byte

My encoding ended up being Windows code page 1252. See also:

read utf-8 CSV file into dataframe

But your encoding could be anything.
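
As a rough illustration of the 0x92 case from the linked post: in Windows code page 1252 the byte 0x92 is a right single quotation mark, while in UTF-8 it cannot start a character. The sketch below decodes the same bytes with several candidate codecs so the results can be compared by eye. `decode_attempts` and the sample bytes are made up for illustration. A decode that succeeds is not proof of the right encoding, since single-byte codecs such as cp1252 accept almost any input.

```python
def decode_attempts(raw, candidates=("utf-8", "cp1252")):
    """Map each candidate codec to the decoded text, or None if decoding fails.

    Illustrative helper only; the candidate list is an assumption.
    """
    results = {}
    for enc in candidates:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


# Made-up sample: "it's" with a curly apostrophe, encoded as cp1252.
sample = b"it\x92s"
attempts = decode_attempts(sample)
for enc, text in attempts.items():
    print(enc, "->", repr(text))
```

Once a plausible encoding is found, pass it to `read_csv` through the `encoding` parameter, as in the question's update.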

MissBleu
