I have a CSV file containing car information from truecar.com, and I want to predict the price of a car from this data, but I get an error. Here is the traceback:

File "x\Python39\lib\site-packages\numpy\core\_asarray.py", line 102, in asarray
return array(a, dtype, copy=False, order=order)
ValueError: could not convert string to float: 'exterior_color'

Code:

import csv
from sklearn import tree

x = [] 
y = [] 

with open('x', 'r') as csv_file:
    data = csv.reader(csv_file)
    for line in data:
        x.append(line[0:-2])
        y.append(line[-1])

    # print(x)
    # print(y)

clf = tree.DecisionTreeClassifier()
clf = clf.fit(x, y)
Xus

1 Answer

DecisionTreeClassifier's fit method expects an array of floats for its X parameter (documentation), so any string value in X raises this ValueError.
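
To find out which columns trigger the conversion error, a minimal sketch like the one below may help. It uses only the standard library and a made-up three-column sample (mileage, exterior_color, price), since the real file's layout is not shown in the question; find_non_numeric is a hypothetical helper. Note that 'exterior_color' in the traceback looks like a column name, which suggests the header row is being read as data. csv.DictReader treats the first row as field names, so it stays out of the data.

```python
import csv
import io

# Hypothetical sample standing in for the real file; the column names are assumptions.
SAMPLE = """mileage,exterior_color,price
42000,black,15500
18000,white,21900
"""


def find_non_numeric(rows):
    """Return the sorted names of columns holding a value that float() rejects."""
    bad = set()
    for row in rows:
        for name, value in row.items():
            try:
                float(value)
            except ValueError:
                bad.add(name)
    return sorted(bad)


# DictReader uses the first row as field names, so the header never ends up in the data.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
non_numeric = find_non_numeric(rows)
print(non_numeric)
```

With the real file, replace io.StringIO(SAMPLE) with the opened file object. Every column reported here needs to be encoded or dropped before calling fit.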

I suggest one-hot encoding your non-numeric variables: this technique transforms a column of categorical data into multiple columns of boolean values. These articles explain it in more detail:

Passing categorical data to Sklearn Decision Tree

Why One-Hot Encode Data in Machine Learning?
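
As a rough illustration of one-hot encoding, here is a sketch that uses only the standard library. The sample rows and column names are made up, because the question does not show the real file, and one_hot_encode is a hypothetical helper rather than a scikit-learn function.

```python
import csv
import io

# Hypothetical sample; the real file's columns are not shown in the question.
SAMPLE = """mileage,exterior_color,price
42000,black,15500
18000,white,21900
63000,black,9800
"""


def one_hot_encode(rows, categorical, target):
    """Build numeric feature rows, expanding each categorical column into 0/1 columns."""
    # Collect the distinct values of every categorical column, in sorted order.
    levels = {col: sorted({row[col] for row in rows}) for col in categorical}
    X, y = [], []
    for row in rows:
        features = []
        for col, value in row.items():
            if col == target:
                continue
            if col in levels:
                # One indicator per known level: 1.0 where the row matches, else 0.0.
                features.extend(1.0 if value == level else 0.0 for level in levels[col])
            else:
                features.append(float(value))
        X.append(features)
        y.append(float(row[target]))
    return X, y


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
X, y = one_hot_encode(rows, categorical=["exterior_color"], target="price")
print(X)
```

X and y now hold only floats, so they can be passed to clf.fit(X, y). In practice, pandas.get_dummies or sklearn.preprocessing.OneHotEncoder do the same job with less code. Since price is a continuous value, tree.DecisionTreeRegressor may also be a better match than a classifier.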

arhr