I have a CSV file with 20501 rows and 26 columns. I want to select the data from the 5th and 9th columns. Here is what I have:
import csv
filename = 'feed_data.csv'
f = open(filename)
readCSV = csv.reader(f, delimiter=',')
names = []
confidence_score = []
for row in readCSV:
    names.append(row[8])
    confidence_score.append(row[4])
Here is the error:
Traceback (most recent call last):
  File "C:/Users/raady/PycharmProjects/feeder_Classification/test.py", line 10, in <module>
    for row in readCSV:
  File "C:\Users\raady\AppData\Local\Programs\Python\Python36\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 1009: character maps to <undefined>
How can I rectify the error? I don't want to use pandas.
Is there any way to copy both columns into a single variable, instead of into names and confidence_score separately?
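One way to narrow this down is to look at the raw bytes around the failure instead of relying on the decoder alone. The sketch below is only an illustration: first_decode_error is a hypothetical helper name, the sample bytes are made up, and the position reported in a traceback may be relative to the chunk being decoded rather than to the whole file.

```python
def first_decode_error(raw: bytes, encoding: str, context: int = 20):
    """Return None if raw decodes cleanly with the given encoding.

    Otherwise return (offset, bad_byte, surrounding_bytes) for the
    first byte that the codec rejects.
    """
    try:
        raw.decode(encoding)
    except UnicodeDecodeError as exc:
        lo = max(exc.start - context, 0)
        return exc.start, raw[exc.start], raw[lo:exc.end + context]
    return None


# Made-up sample: a UTF-8 encoded right double quote (bytes e2 80 9d).
# cp1252 has no character for byte 0x9d, so decoding as cp1252 fails there.
sample = b'name,"quoted \xe2\x80\x9dtext"\n'
result = first_decode_error(sample, 'cp1252')
print(result)

# For the real file, the bytes could be read in binary mode (no decoding):
# with open('feed_data.csv', 'rb') as f:
#     print(first_decode_error(f.read(), 'cp1252'))
```

Seeing the bytes next to the rejected one often hints at the real encoding. For instance, 0x9d preceded by e2 80 would point to UTF-8 text being read as cp1252.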
Edit: I have installed Python 3.6 and the PyCharm environment. I installed all the packages from within PyCharm.
Edit 2:
I tried the change from the suggested link, modifying the call to f=open(filename,encoding='utf8'), but I still get the error UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 934: invalid start byte.
The CSV file has been encoded in UTF-8.
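An "invalid start byte" error means that at least one byte in the file is not valid UTF-8, whatever encoding the file is labelled with. A rough way to check is to try several candidate encodings against the raw bytes. This is a minimal sketch, assuming a hypothetical helper name (encodings_that_decode) and made-up sample bytes. Note that latin-1 accepts every byte value, so a clean decode with it does not prove the text is correct.

```python
def encodings_that_decode(raw: bytes, candidates):
    """Return the candidate encodings that decode raw without an error."""
    ok = []
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this codec rejects at least one byte
        ok.append(name)
    return ok


# Made-up sample: byte 0xe9 followed by a comma is not valid UTF-8.
sample = b'caf\xe9,0.9\n'
usable = encodings_that_decode(sample, ['utf-8', 'cp1252', 'latin-1'])
print(usable)

# For the real file, read the bytes first:
# with open('feed_data.csv', 'rb') as f:
#     print(encodings_that_decode(f.read(), ['utf-8', 'cp1252', 'latin-1']))
```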
Edit 3: I slightly modified the code like this:
import csv
filename = 'feed_data.csv'
# filename = 'test.csv'
with open(filename) as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    data2 = []
    for row in readCSV:
        data = []
        data.append(row[14])  # appending names
        data.append(row[5])  # appending confidence
        data2.append(data)
print(data2)
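The same idea can be written more compactly, with both columns kept in one list of tuples. The following is a sketch rather than a confirmed fix: select_columns is a hypothetical helper, the in-memory sample stands in for the real file, and the encoding in the commented open() call is an assumption. Passing errors='replace' to open() substitutes U+FFFD for undecodable bytes instead of raising, which hides the problem rather than solving it.

```python
import csv
import io


def select_columns(csvfile, indices):
    """Return one list with a tuple of the requested columns for every row."""
    reader = csv.reader(csvfile, delimiter=',')
    return [tuple(row[i] for i in indices) for row in reader]


# In-memory stand-in for the real file, so the sketch runs on its own.
sample = io.StringIO('a,b,c\n1,2,3\n')
pairs = select_columns(sample, (2, 0))
print(pairs)

# With the real file, the call might look like this (encoding is assumed):
# with open('feed_data.csv', newline='', encoding='utf-8',
#           errors='replace') as csvfile:
#     pairs = select_columns(csvfile, (14, 5))
```

A row with fewer fields than the largest requested index would raise IndexError here, so ragged files would need an extra length check.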
I am adding the two files, test.py and feed_data (downloaded directly from Kaggle). With test.csv the code works fine and I am able to select the required column data, but with feed_data.csv it gives the error mentioned above.