Here is the self-explanatory code for a toy example in which I fit a second-degree polynomial model (successfully) and then try to plot the corresponding curve on the scatterplot:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm              
import statsmodels.stats.stattools as stools
import statsmodels.stats as stats
from   statsmodels.graphics.regressionplots import *
from   statsmodels.sandbox.regression.predstd import wls_prediction_std
from   statsmodels.formula.api import ols


import io
import requests

url = "https://raw.githubusercontent.com/RInterested/datasets/gh-pages/mtcars.csv"
contents = requests.get(url).content
mtcars = pd.read_csv(io.StringIO(contents.decode('utf-8')))

mtcars['wt_square']=mtcars['wt']**2
model = ols('mpg ~ wt + wt_square', data=mtcars).fit()
print(model.summary())
print(model.params)

plt.scatter(mtcars['wt'], mtcars['mpg'])

enter image description here

plt.scatter(mtcars['wt'], mtcars['mpg'])
x2 = np.arange(0, 6, 1) # I used only a few points to test.
y2 = np.polyval(model.params, x2)
print(x2)
print(y2)
plt.plot(x2, y2, label="deg=2")

I have been trying to follow this post. I think that my faulty code above is multiplying the intercept by the squared independent variable, but I am not sure this is the actual problem. Here is the totally wrong output:

enter image description here
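The suspicion about the intercept can be checked without the dataset. The sketch below assumes the coefficients come back in the order Intercept, wt, wt_square (the order the formula suggests), and uses approximate coefficient values back-solved from the R predictions listed in this question, not fresh model output. `np.polyval` expects the highest power first, so passing the coefficients as they are treats the intercept as the coefficient of x squared; reversing them restores the intended polynomial.

```python
import numpy as np

# Approximate coefficients in the assumed order [Intercept, wt, wt_square].
# Illustrative values back-solved from the R predictions in the question.
params = np.array([49.93081, -13.380335, 1.171085])

x2 = np.arange(0, 6, 1)

# np.polyval reads coefficients from the highest power down, so this
# evaluates 49.93*x**2 - 13.38*x + 1.17 (intercept used as the x**2 term).
y_wrong = np.polyval(params, x2)

# Reversing the order gives 1.17*x**2 - 13.38*x + 49.93, the fitted curve.
y_right = np.polyval(params[::-1], x2)

print(y_wrong)
print(y_right)
```

With the fitted model from the question, the analogous call would presumably be `np.polyval(model.params.to_numpy()[::-1], x2)`.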

The equivalent in R (what I am after) would be:

fit <- lm(mpg ~ poly(wt,2,raw=T), mtcars)
plot(mpg ~ wt, mtcars)
lines(sort(mtcars$wt), fitted(fit)[order(mtcars$wt)], col='red', type='b') 

enter image description here

If the vector of values x2 had been used to generate this plot, the manual calculation in R would have been:

MM <- model.matrix(fit)
testing_points <- 0:5
tpsq <- testing_points^2
ones <- rep(1, length(testing_points))
testmat <- data.frame(ones, testing_points, tpsq)
as.matrix(testmat) %*% as.vector(fit$coef)

yielding:

49.93081
37.72156
27.85448
20.32958
15.14685
12.30630

Perhaps there is a way of arranging the linear algebra with Python.
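One possible way to arrange the same linear algebra in NumPy is sketched below. It mirrors the R steps: build a design matrix with a column of ones, the test points, and their squares, then multiply by the coefficient vector. The coefficient values are approximations back-solved from the R output above and stand in for `model.params`; the name `beta` is only illustrative.

```python
import numpy as np

# Stand-in for the fitted coefficients, assumed order [Intercept, wt, wt_square].
# Approximate values back-solved from the R predictions in the question.
beta = np.array([49.93081, -13.380335, 1.171085])

testing_points = np.arange(0, 6)

# Design matrix: one row per test point, columns are 1, x, x**2
# (the same layout as `testmat` in the R code).
X = np.column_stack([
    np.ones(len(testing_points)),
    testing_points,
    testing_points ** 2,
])

# Matrix-vector product, the NumPy counterpart of `%*%` in R.
y_hat = X @ beta
print(y_hat)
```

With the fitted model itself, `model.predict(pd.DataFrame({'wt': x2, 'wt_square': x2**2}))` should give the same predictions without building the matrix by hand, since the formula interface takes new data by column name.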

Antoni Parellada