I want to read data from a CSV file so that the algorithm can guess a person's gender. The program predicts gender from height, weight, and shoe size.

But I get this error and cannot fix it:

y.append(line[4])

IndexError: list index out of range

height,weight,n_shoes,sexuality
190,88,44,male
167,66,36,female
182,80,42,male
177,78,43,male
164,59,35,female
183,79,40,male
158,57,36,female
155,52,34,female
193,89,45,male
163,54,35,female

Code:

import csv
from sklearn import tree

x = []
y = []

with open('people.csv' , 'r') as csvfile:
    data = csv.reader(csvfile)    
    for line in data:
        x.append(line[1:4])
        y.append(line[4])


clf = tree.DecisionTreeClassifier()
clf = clf.fit(x , y)

new_data = [[190,89,43] , [160,56,39]]
answer = clf.predict(new_data)

print(answer[0])
print(answer[1])

The program should read the new data from the new_data variable and guess the gender of each person.

For example:

[190 , 89 , 42] ==> male 
[162 , 59 , 37] ==> female
Austin

2 Answers

Indexes are zero-based, so the “fourth” item is line[3]. Alter your loop to use:

for line in data:
    x.append(line[:3])
    y.append(line[3])

(In this case the fourth item is also the last item, so an alternative is line[-1].)
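
To make the indexing concrete, here is a minimal sketch using one data row from the question's file. The variable names (`row`, `features`, `label`, `last`, `error`) are only for illustration and are not part of the original code.

```python
# One data row as csv.reader would yield it: a list of four strings.
row = ['190', '88', '44', 'male']

features = row[:3]   # items at indexes 0, 1 and 2
label = row[3]       # the fourth item lives at index 3
last = row[-1]       # a negative index counts from the end

try:
    row[4]           # there is no fifth item, so this raises
    error = None
except IndexError as exc:
    error = str(exc)

print(features, label, last, error)
```

A list of four items has valid indexes 0 through 3, which is why line[4] fails on every row of this file.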

donkopotamus

As the other answer says, you need a small fix:

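import csv

from sklearn import tree

x = []
y = []

with open('people.csv', 'r', newline='') as csvfile:
    data = csv.reader(csvfile)
    # Skip the header row (assumes the file starts with
    # height,weight,n_shoes,sexuality as shown in the question).
    next(data)
    for line in data:
        # Columns 0-2 are the features; csv yields strings, so convert to int.
        x.append([int(value) for value in line[0:3]])
        # Column 3 is the label (male/female).
        y.append(line[3])


clf = tree.DecisionTreeClassifier()
clf = clf.fit(x, y)

new_data = [[190, 89, 43], [160, 56, 39]]
answer = clf.predict(new_data)

print(answer[0])
print(answer[1])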

I changed y.append(line[4]) to y.append(line[3]) and x.append(line[1:4]) to x.append(line[0:3]), assuming you want to select the first three elements as features.

Why this happens:

List indexing starts at 0, but you assumed it starts at 1.
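
One way to sidestep off-by-one mistakes altogether is to select columns by name instead of by position. The following is a minimal sketch, not the original code: it assumes the file begins with the header row shown in the question, and it uses an in-memory `io.StringIO` sample as a stand-in for people.csv so that it runs on its own.

```python
import csv
import io

# Stand-in for people.csv so the sketch is self-contained; in practice
# you would use open('people.csv', newline='') instead.
sample = io.StringIO(
    "height,weight,n_shoes,sexuality\n"
    "190,88,44,male\n"
    "167,66,36,female\n"
)

x = []
y = []

# DictReader consumes the header row and keys each value by column name.
for record in csv.DictReader(sample):
    x.append([
        int(record['height']),
        int(record['weight']),
        int(record['n_shoes']),
    ])
    y.append(record['sexuality'])

print(x)
print(y)
```

The resulting x and y have the same shape as in the loop above and can be passed to clf.fit(x, y) unchanged.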

This article may help you.

R4444