Decoding error while reading CSV files from a folder

Question

I am reading the headers of CSV files from a folder.

Code:

from os import listdir

# mypath = directory containing the CSV files
for each_file in listdir(mypath):
    with open(mypath + "//" + each_file) as f:
        first_line = f.readline().strip().split(",")

Error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 3131: invalid start byte

Environment:

Spyder, Python 3

I don't understand the encoding error, since I have not specified any encoding myself.

Check the following link; it may help: https://stackoverflow.com/questions/48540170/unicodedecodeerror-when-reading-csv-file-in-pandas-with-python-for-bulgarian-cyr - Aayush Bhatnagar, May 30 '18 at 05:38

score 0 · Answer 1 · answered May 30 '18 at 05:49

Try using a single slash '/':

with open(mypath + "/" + each_file) as f:

Another possibility is that the CSV file is not encoded as UTF-8 but in some other encoding. It would be easier to tell if you posted a sample of the CSV file too.
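To illustrate the encoding point: in text mode, open() decodes the file's bytes using a default encoding when none is given, and the error message shows that UTF-8 was used here. Below is a minimal sketch of why byte 0x80 fails. The sample bytes and the helper name try_decode are made up for illustration, and cp1252 is only a guess at what a Windows tool might have written, not a confirmed diagnosis of the asker's files.

```python
# Hypothetical header bytes; 0x80 can never start a UTF-8 sequence.
raw = b"price,\x80 amount\n"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


print(try_decode(raw, "utf-8"))   # None: the same kind of failure as in the question
print(try_decode(raw, "cp1252"))  # decodes, because 0x80 is the euro sign in cp1252
```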

Sudip Ghimire

score 0 · Accepted Answer · answered May 30 '18 at 06:05

Try passing an encoding when opening the file in the with statement. The code below worked fine for me. If it fails for you, try different encodings and see whether any of them works.

from os import listdir

# path = directory containing the CSV files
for each_file in listdir(path):
    # Swap 'utf-8' for the file's actual encoding if decoding still fails
    with open(path + "//" + each_file, encoding='utf-8') as f:
        first_line = f.readline().strip().split(",")
        print(each_file, ' --> ', first_line)

Also, see this link on how to check the encoding of a CSV file. Hope it helps.

How to check encoding of CSV file
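If you would rather check from Python, here is a rough sketch that tries a few candidate encodings and reports the first one that decodes the whole file. The function name detect_encoding and the candidate list are my own assumptions, not anything from the linked post. Keep in mind that latin-1 accepts every byte, so it is only a last resort, and a successful decode does not prove the text is displayed correctly.

```python
import os
import tempfile


def detect_encoding(file_path, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate that decodes the whole file, else None."""
    with open(file_path, "rb") as f:  # binary mode: no decoding happens here
        data = f.read()
    for encoding in candidates:
        try:
            data.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            continue
    return None


# Demo on a throwaway file containing the byte from the error message
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.csv")
    with open(sample, "wb") as f:
        f.write(b"name,price\nwidget,\x80 5\n")
    print(detect_encoding(sample))  # cp1252
```

Whatever it reports can then be passed as the encoding argument to open().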

Happy Coding :)

score 0 · Answer 3 · answered Jul 03 '18 at 14:26

The built-in os.path.join provides a convenient way to join two or more path components without worrying about platform-specific separators ('/' or '\').

import os

# path = directory containing the CSV files
files = os.listdir(path)

for file in files:
    with open(os.path.join(path, file), encoding='utf-8') as f:
        first_line = f.readline().strip().split(",")
        print(file, ' --> ', first_line)
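One more thing to watch for: splitting the first line on "," breaks when a header field is quoted and contains a comma. A possible variation, as a sketch only, lets the standard csv module parse the header instead. The function name read_headers and the .csv filter are my own additions, and the encoding argument still has to match the files.

```python
import csv
import os


def read_headers(folder, encoding="utf-8"):
    """Map each .csv file name in `folder` to its list of header fields."""
    headers = {}
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Skip subdirectories and files that are not CSVs
        if not (os.path.isfile(full_path) and name.lower().endswith(".csv")):
            continue
        # The csv docs recommend newline="" for files passed to csv.reader
        with open(full_path, encoding=encoding, newline="") as f:
            headers[name] = next(csv.reader(f), [])  # [] for an empty file
    return headers
```

Calling read_headers(path) returns a dict, so the headers can be printed or compared across files afterwards.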
