12

I am building a multivariate regression with sklearn and have read through the documentation. When I call the predict() function I get the error: predict() takes 2 positional arguments but 3 were given

X is a data frame and y is a column. I have tried converting the data frame to an array/matrix, but I still get the error.

I have added a snippet showing the x and y arrays.

reg.coef_
reg.predict(x,y)

x_train=train.drop('y-variable',axis =1)
y_train=train['y-variable']

x_test=test.drop('y-variable',axis =1)
y_test=test['y-variable']


x=x_test.as_matrix()
y=y_test.as_matrix()

reg = linear_model.LinearRegression()
reg.fit(x_train,y_train)

reg.predict(x,y)
Mohamed Ali JAMAOUI
GD_N

3 Answers

28

Use reg.predict(x). You don't need to provide the y values to predict. In fact, the purpose of training the machine learning model is to let it infer the values of y given the input parameters in x.

Also, the documentation of predict explains that it expects only x as a parameter.

The reason why you get the error:

predict() takes 2 positional arguments but 3 were given

is that, when you call reg.predict(x), Python implicitly passes reg itself as the first argument, self, so the call is equivalent to LinearRegression.predict(reg, x). That is why the error says predict() takes 2 positional arguments. Your call, reg.predict(x,y), is equivalent to LinearRegression.predict(reg, x, y), so 3 positional arguments are passed instead of 2, which explains the whole error message.
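As a minimal sketch of the argument counting described above, the following uses a made-up class, TinyModel, that is not part of scikit-learn; it only assumes that predict is an ordinary Python method with the parameters self and x.

```python
class TinyModel:
    """Hypothetical stand-in for an estimator; not part of scikit-learn."""

    def predict(self, x):
        # Two positional parameters: self (the instance) and x.
        return [2 * value for value in x]


model = TinyModel()

# Calling the method on the instance passes `model` as `self` automatically,
# so these two calls are equivalent.
bound_result = model.predict([1, 2, 3])
explicit_result = TinyModel.predict(model, [1, 2, 3])

# Passing one extra argument reproduces the kind of error from the question.
try:
    model.predict([1, 2, 3], [2, 4, 6])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The printed message ends with "takes 2 positional arguments but 3 were given"; depending on the Python version it may be prefixed with the class name.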

Mohamed Ali JAMAOUI
  • @GD_N You should be able to accept an answer regardless - click the green check mark to accept an answer (not the same as voting on it). That will close the question and make it more useful for others. Please see [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers) – charlesreid1 Oct 13 '17 at 03:12
  • 2
    @charlesreid1 did that. – GD_N Oct 13 '17 at 20:21
1

When you evaluate on the test set, you treat it as if its labels were unknown: the model predicts from the features alone, and you then compare those predictions with the real labels to see how well the model generalizes. So when you want to predict, you pass only your X variable(s).

Mohamed Ali JAMAOUI
pissall
0

I think you are confusing reg.predict() with reg.score(). The former makes predictions with the trained model. It takes only your features/independent variables X as input (plus the object itself, self, which Python handles internally) and returns the corresponding predicted target/dependent variable Y. You can then compare those predictions with the actual target values to evaluate the model. If you want to do the evaluation in a single step, use the reg.score() method instead: it takes both X and Y as inputs and computes the corresponding evaluation measure (R^2 or accuracy, depending on the problem at hand). Please refer to the documentation of sklearn.linear_model.LinearRegression for more information.

Also, these methods are common to most of the supervised learning models in sklearn.
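To make the difference concrete, here is a small sketch on made-up, exactly linear data (the arrays and the train/test split are assumptions for illustration, not the asker's data). It calls predict with the features only and score with the features and the true targets.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Made-up data: y = 3*x0 + 2*x1 + 1, with no noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = 3 * X[:, 0] + 2 * X[:, 1] + 1

# Simple hold-out split: first 30 rows for training, last 10 for testing.
X_train, X_test = X[:30], X[30:]
y_train, y_test = y[:30], y[30:]

reg = LinearRegression()
reg.fit(X_train, y_train)

# predict takes the features only and returns the predicted targets.
y_pred = reg.predict(X_test)

# score takes features and true targets and returns R^2 for a regressor.
r2 = reg.score(X_test, y_test)

# The same R^2 computed in two steps from the predictions.
r2_two_step = r2_score(y_test, y_pred)
```

Here reg.score(X_test, y_test) and r2_score(y_test, y_pred) give the same number; score simply runs the prediction internally before comparing with the true targets.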

Parthasarathy Subburaj