
I have divided my data set into two sets, a training set and a test set. My data contains 6 types of dummy variables. Every time I try to run the model on my training set, I get an error. This is my code:

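import statsmodels.api as sm
from patsy import dmatrix

# Design matrix built from the full data set (used to fit the model)
X = dmatrix('sfdc_tier + poc_image + sub_segment + Product_Set + Volume_2019_Product', data = data)
# Design matrix built separately from the subset (used for prediction)
x = dmatrix('sfdc_tier + poc_image + sub_segment + Product_Set + Volume_2019_Product', data = Training_Set)
Y = data['Discount_Total']
model = sm.OLS(Y,X).fit()
y_pred = model.predict(x)  # raises the ValueError below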

Note that 'Volume_2019_Product' is the only numerical variable; the rest are categorical.

The error I am getting is the following:

ValueError: shapes (662,69) and (90,) not aligned: 69 (dim 1) != 90 (dim 0)

How do I resolve this error? I need my training data matrix to have exactly the same columns as the original dmatrix X. The training data has the same column headers as the data set on which I fitted the model, but it doesn't contain every category (level) of each categorical variable, which causes the error in model prediction.
