I want to use scikit-learn (sklearn) to do machine learning on a database table. My goal is to match (predict) the quantity from all the other existing data.
Most of the data is not purely numeric: the values mix digits and letters, and most of the columns are categorical.
I would like to understand why the code below does not work, and how I can match the quantity and/or the customer to the rest of the data.
Any recommendations are welcome.
Important note: most of the columns are not numeric in meaning. They hold identifiers such as serial numbers, ID card numbers, category numbers, part numbers, etc.
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression
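# Target: the CatID column
y = df['CatID']
# Features: every column except the two date columns and the target
x = df.drop(columns=["CreateDate", "DestDate", "CatID"])
X = x.values
Y = y.values
lin_reg = LinearRegression()
lin_reg.fit(X, Y)  # this line raises the error below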
ValueError: could not convert string to float: 'R17'
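For reference, the error appears because LinearRegression needs numeric input, and a value such as 'R17' cannot be cast to float. A minimal sketch of one common way around it is to one-hot encode the identifier-like columns before fitting. The toy data and the "Quantity" column name below are made up for illustration and are assumptions, not my real table:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy data, made up for illustration; "Quantity" is a hypothetical target column.
toy = pd.DataFrame({
    "CatID": ["R17", "R17", "B02", "B02", "C33", "C33"],
    "SellerID": ["S1", "S2", "S1", "S2", "S1", "S2"],
    "CurrencyID": [1, 1, 2, 2, 1, 2],
    "Quantity": [10.0, 12.0, 3.0, 4.0, 7.0, 8.0],
})

features = toy.drop(columns=["Quantity"])
target = toy["Quantity"]

# Treat every identifier-like column as categorical, including integer codes,
# because the numbers are labels and carry no ordering.
categorical_cols = ["CatID", "SellerID", "CurrencyID"]

encoder = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), categorical_cols)],
    remainder="passthrough",  # any remaining numeric columns pass through unchanged
)

# The pipeline encodes the features and then fits the regression on the result.
model = make_pipeline(encoder, LinearRegression())
model.fit(features, target)
predictions = model.predict(features)
```

If the target itself is categorical (CatID or a customer ID), a classifier such as LogisticRegression would probably be a better fit than LinearRegression in the same pipeline. LabelEncoder is intended for the target column, not for feature columns.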
Edit:
After trying LabelEncoder from scikit-learn, I can't convert the object columns to str. Relevant rows from df.info():
 #   Column      Non-Null Count   Dtype
 8   U1TypeID    14099 non-null   float64
 9   CatID       14099 non-null   object   <-----
 10  CurrencyID  14099 non-null   int64
 12  RowNumber   14099 non-null   int64
 15  SellerID    14099 non-null   object   <-----
 16  AgentID     14099 non-null   float64
df[['CatID']] = str(df[['CatID']])
df['CatID'].str
df['CatID'] = df['CatID'].astype(str)
None of these converts the column to a string; the dtype still shows as object.
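As far as I can tell, in pandas versions that store strings as object by default, `object` is simply how a column of Python strings is reported, so `astype(str)` may already have worked even though the dtype does not change. Also note that `str(df[['CatID']])` builds one string from the printed DataFrame and writes that same text into every row. A small sketch with made-up values to check what is actually stored:

```python
import pandas as pd

# Made-up values for illustration: a mix of text and a number in one column.
toy = pd.DataFrame({"CatID": ["R17", 42, "B02"]})

# Convert each value to a Python str.
toy["CatID"] = toy["CatID"].astype(str)

# Check the type of every element instead of relying on the reported dtype.
all_str = toy["CatID"].map(type).eq(str).all()

# The dedicated "string" dtype makes the conversion visible in df.info().
as_string = toy["CatID"].astype("string")
```

Either way, scikit-learn estimators still need the strings encoded as numbers; converting to str alone does not remove the ValueError.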