I am trying to import a CSV file in PyCharm using pandas. Here is my code:

 import pandas as pd
 data=pd.read_csv(r'C:\Users\agns1\Downloads\data_work_final.csv')

Sadly, I'm getting this error:

 File "pandas\_libs\parsers.pyx", line 542, in pandas._libs.parsers.TextReader.__cinit__
 File "pandas\_libs\parsers.pyx", line 642, in pandas._libs.parsers.TextReader._get_header
 File "pandas\_libs\parsers.pyx", line 843, in pandas._libs.parsers.TextReader._tokenize_rows
 File "pandas\_libs\parsers.pyx", line 1917, in pandas._libs.parsers.raise_parser_error
 UnicodeDecodeError: 'utf-8' codec can't decode byte 0xef in position 4: invalid continuation byte

Any thoughts on how I can fix this?

bad_coder
Eliza R
    Does this answer your question? [python: UnicodeDecodeError: 'utf8' codec can't decode byte 0xc0 in position 0: invalid start byte](https://stackoverflow.com/questions/23772144/python-unicodedecodeerror-utf8-codec-cant-decode-byte-0xc0-in-position-0-i) – bad_coder Nov 29 '21 at 14:29

1 Answer

You need to check the file encoding:

import chardet

# Read the first 10,000 bytes in binary mode and let chardet guess the encoding
with open(r'C:\Users\agns1\Downloads\data_work_final.csv', 'rb') as rawdata:
    result = chardet.detect(rawdata.read(10000))

print(result)

You'll get something like:

{'encoding': <'the actual encoding'>, 'confidence': xxx, 'language': xxxx}

Then do:

data = pd.read_csv(r'C:\Users\agns1\Downloads\data_work_final.csv', encoding=result['encoding'])
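
If chardet is not installed, or its confidence is low, a rough fallback is to try a few likely encodings in order and keep the first one that decodes the bytes without raising. The sketch below uses only the standard library. The helper name `first_working_encoding`, the candidate list, and the sample bytes are assumptions made for illustration, not part of pandas or chardet.

```python
from typing import Iterable, Optional


def first_working_encoding(
    raw: bytes,
    candidates: Iterable[str] = ("utf-8", "cp1252", "latin-1"),
) -> Optional[str]:
    """Return the first candidate encoding that decodes `raw` without error.

    Hypothetical helper: a successful decode does not prove the encoding is
    the right one (latin-1 accepts every byte), so order candidates from
    strictest to most permissive and inspect the result.
    """
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


# Made-up sample: text with accented characters encoded as cp1252.
# These bytes are not valid UTF-8, similar to the error in the question.
sample = "café;ï\n1;2\n".encode("cp1252")
print(first_working_encoding(sample))  # cp1252
```

For a real file, the bytes could come from `open(path, 'rb').read()`, and the returned name could then be passed as the `encoding` argument of `pd.read_csv`. It is worth checking a few rows of the resulting DataFrame, since a wrong but permissive encoding produces garbled characters rather than an error.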