Distance between Linear Regression slope and data point

Question

I have fit a LinearRegression() model. What I want to do now is basically calculate the distance between some data points and the regression line.

My data points are two-dimensional points (x, y).

My question is: How can I get the equation of the line from the LinearRegression() model?

Vivek Kalyanarangan · Answer 1 · 2018-03-06T12:41:19.343

After you have fit the model, you can read the coef_ and intercept_ attributes to see the coefficients and the intercept, respectively.

But this would mean writing out the formula for your model by hand. My recommendation is that once you have built your model, you make predictions and score them against the true y values:

from sklearn.metrics import mean_squared_error
mean_squared_error(y_test, y_pred) # y_test are true values, y_pred are the predictions that you get by calling regression.predict()

If the goal is to calculate distances, you can use the sklearn.metrics convenience functions instead of looking up the equation and computing it by hand. The manual way to do that would be:

import numpy as np
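# Prepend a column of ones so the intercept acts as the first weight (single-target model, 1D coef_)
X_design = np.column_stack((np.ones(X_test.shape[0]), X_test))
y_pred = X_design @ np.insert(clf.coef_, 0, clf.intercept_)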
sq_err = np.square(y_pred - y_test)
mean_sq_err = np.mean(sq_err)

I'm basically looking for the normal distance between the point and the line, and want to obtain a score from it. Since the data is 2D, this is literally doing a poor man's PCA. — Kristijan, Mar 06 '18 at 12:47

score 1 · Accepted Answer · answered Mar 06 '18 at 12:29

From the documentation, use clf.coef_ for the weight vector(s) and clf.intercept_ for the bias:

coef_ : array, shape (n_features, ) or (n_targets, n_features)
Estimated coefficients for the linear regression problem. If multiple targets are passed during the fit (y 2D), this is a 2D array of shape (n_targets, n_features), while if only one target is passed, this is a 1D array of length n_features.

intercept_ : array
Independent term in the linear model.

Once you have these, see here.
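For the 2D case in the question, the normal (perpendicular) distance can be sketched as follows. This is only a minimal illustration, not part of scikit-learn: the helper names perpendicular_distance and vertical_distance are made up here, and the slope and intercept values are placeholder numbers. It rewrites the line y = slope * x + intercept as slope * x - y + intercept = 0 and applies the standard point-to-line distance formula. The vertical distance is included for contrast, since that is what the regression residuals measure.

```python
import math


def perpendicular_distance(x, y, slope, intercept):
    """Normal distance from the point (x, y) to the line y = slope * x + intercept."""
    # Line in implicit form: slope * x - y + intercept = 0.
    # Distance = |a*x + b*y + c| / sqrt(a^2 + b^2) with a = slope, b = -1, c = intercept.
    return abs(slope * x - y + intercept) / math.hypot(slope, 1.0)


def vertical_distance(x, y, slope, intercept):
    """Absolute residual: distance measured along the y axis only."""
    return abs(y - (slope * x + intercept))


# Placeholder values for illustration: the line y = 2x + 1 and the point (3, 4).
slope, intercept = 2.0, 1.0
print(perpendicular_distance(3.0, 4.0, slope, intercept))
print(vertical_distance(3.0, 4.0, slope, intercept))
```

With a model fitted on a single feature and a single target, slope would presumably be clf.coef_[0] and intercept would be clf.intercept_, following the attribute descriptions quoted above.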

Somehow I missed this. Thanks. — Kristijan, Mar 06 '18 at 12:36
