
I want to apply machine learning with scikit-learn to a database, and I am interested in relating the quantity to all the other existing data.

Of course, most of the data is not clean: it is a mix of numbers and letters, and most of it is categorical.

I would love to understand why the code below does not work, and how scikit-learn can be used to relate the quantity and/or the customer to the rest of the data.

Any recommendations are welcome.

An important note: most of the entries are not numeric quantities. They are identifiers such as serial numbers, ID card numbers, category numbers, part numbers, etc.

from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.metrics import r2_score
from sklearn.linear_model import LinearRegression
y = df['CatID']
x = df.drop(columns=["CreateDate",'DestDate',"CatID"])
X = x.values
Y = y.values
lin_reg = LinearRegression()
lin_reg.fit(X,Y)    # error line

ValueError: could not convert string to float: 'R17'


Edit

After applying scikit-learn's LabelEncoder, I cannot convert the object columns to str:

8   U1TypeID        14099 non-null  float64       
9   CatID           14099 non-null  object   <-----       
10  CurrencyID      14099 non-null  int64         
12  RowNumber       14099 non-null  int64         
15  SellerID        14099 non-null  object   <-----
16  AgentID         14099 non-null  float64    

   
df[['CatID']] = str(df[['CatID']])
df['CatID'].str
df['CatID'] = df['CatID'].astype(str)

The column is still not converted to string!

1 Answer

The "SellerID" column contains strings, and LinearRegression, like most scikit-learn estimators, accepts only numeric (integer/float) features; it cannot interpret string values.
So you should transform the "SellerID" column before feeding the data to your ML model. I suggest that you either

  1. transform it with LabelEncoder from scikit-learn,
    or
  2. extract the numbers from the column, for example 'R17' => 17, by applying a lambda function to your column:
    df['SellerID'] = df['SellerID'].apply(lambda x: int(x[1:]))
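
Both options can be sketched on a small frame. This is only an illustration under assumptions: the data is invented, the "Quantity" target column is a hypothetical name, and OrdinalEncoder is swapped in for LabelEncoder because scikit-learn documents LabelEncoder as intended for the target rather than for feature columns.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical toy data: column names mirror the question, values are made up.
df = pd.DataFrame({
    "SellerID": ["R17", "R17", "R20", "R03"],
    "CurrencyID": [1, 2, 1, 1],
    "Quantity": [5.0, 7.0, 3.0, 4.0],  # assumed name for the target column
})

# Option 1: map each distinct string to an integer code
# (categories are sorted, so 'R03' < 'R17' < 'R20').
encoder = OrdinalEncoder()
df["SellerID_code"] = encoder.fit_transform(df[["SellerID"]]).ravel()

# Option 2: pull the digits out of values shaped like 'R17'.
df["SellerID_num"] = (
    df["SellerID"].str.extract(r"(\d+)", expand=False).astype(int)
)

# With only numeric columns left, fit() no longer raises the ValueError.
X = df[["SellerID_code", "CurrencyID"]].to_numpy()
y = df["Quantity"].to_numpy()
model = LinearRegression().fit(X, y)
```

Keep in mind that integer codes impose an artificial order on identifiers, which a linear model will treat as meaningful. For ID-like columns, one-hot encoding (OneHotEncoder or pd.get_dummies) is often the safer choice.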
  • Thank you so much. Now I need to convert object to string. – Aviran Marzouk Feb 28 '22 at 14:37
  • You're welcome. To convert a column from object to string, use df['column'] = df['column'].astype('str') – Hamza Ghanmi Feb 28 '22 at 15:07
  • You should convert your columns from object to str as the first step and then apply the transformation, not the other way around. Check this link if you have a problem converting your columns from object to str: https://stackoverflow.com/questions/33957720/how-to-convert-column-with-dtype-as-object-to-string-in-pandas-dataframe – Hamza Ghanmi Feb 28 '22 at 15:47
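
A note on the "not converted to string" symptom: in pandas versions before 3.0, `astype(str)` does convert every value to a Python `str`, but the column's dtype is still reported as `object`, because that is how those versions store strings. A minimal sketch, assuming made-up values for "CatID":

```python
import pandas as pd

# Hypothetical mixed values standing in for the real CatID column.
df = pd.DataFrame({"CatID": ["R17", 12, "A3"]})

# Every element becomes a Python str, even if df.info() may still say "object".
as_str = df["CatID"].astype(str)
print(as_str.dtype)
print(all(isinstance(v, str) for v in as_str))

# The dedicated pandas string dtype is requested explicitly by name.
as_string = as_str.astype("string")
print(as_string.dtype)
```

So an `object` dtype after the conversion does not by itself mean the conversion failed; checking the type of individual values is more telling.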