2

Suppose I have the following code to fit a hyperbolic parabola:

# attach(mtcars)    
hp_fit <- lm(mpg ~ poly(wt, disp, degree = 2), data = mtcars)

Where wt is the x variable, disp is the y variable, and mpg is the z variable. (summary(hp_fit))$coefficients outputs the following:

   >(summary(hp_fit))$coefficients
                                     Estimate Std. Error    t value     Pr(>|t|)
    (Intercept)                     22.866173   3.389734  6.7457122 3.700396e-07
    poly(wt, disp, degree = 2)1.0  -13.620499   8.033068 -1.6955539 1.019151e-01
    poly(wt, disp, degree = 2)2.0   15.331818  17.210260  0.8908534 3.811778e-01
    poly(wt, disp, degree = 2)0.1   -9.865903   5.870741 -1.6805208 1.048332e-01
    poly(wt, disp, degree = 2)1.1 -100.022013 121.159039 -0.8255431 4.165742e-01
    poly(wt, disp, degree = 2)0.2   14.719928   9.874970  1.4906301 1.480918e-01

I do not understand how to interpret the varying numbers to the right of poly() under the (Intercept) column. What is the significance of these numbers and how would I construct an equation for the hyperbolic paraboloid fit from this summary?

Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
  • It won't be too hard if you use raw polynomials by setting `raw = TRUE` inside `poly()`. Using orthogonal polynomials is better for the stability of your regression, but the equation will be much more painful to extract. Related reading is [Extracting orthogonal polynomial coefficients from R's poly function](https://stackoverflow.com/a/26729318/903061) and [How poly generates orthogonal polynomials](https://stackoverflow.com/a/39051154/903061). – Gregor Thomas Apr 30 '20 at 16:10

1 Answers1

1

When you compare

with(mtcars, poly(wt, disp, degree=2))
with(mtcars, poly(wt, degree=2))
with(mtcars, poly(disp, degree=2))

the 1.0 2.0 refer to the first and second degree of wt, and the 0.1 0.2 refer to the first and second degree of disp. The 1.1 is an interaction term. You may check this by comparing:

summary(lm(mpg ~ poly(wt, disp, degree=2, raw=T), data=mtcars))$coe
#                                         Estimate  Std. Error    t value     Pr(>|t|)
# (Intercept)                         4.692786e+01 7.008139762  6.6961935 4.188891e-07
# poly(wt, disp, degree=2, raw=T)1.0 -1.062827e+01 8.311169003 -1.2787937 2.122666e-01
# poly(wt, disp, degree=2, raw=T)2.0  2.079131e+00 2.333864211  0.8908534 3.811778e-01
# poly(wt, disp, degree=2, raw=T)0.1 -3.172401e-02 0.060528241 -0.5241191 6.046355e-01
# poly(wt, disp, degree=2, raw=T)1.1 -2.660633e-02 0.032228884 -0.8255431 4.165742e-01
# poly(wt, disp, degree=2, raw=T)0.2  2.019044e-04 0.000135449  1.4906301 1.480918e-01

summary(lm(mpg ~ wt*disp + I(wt^2) + I(disp^2) , data=mtcars))$coe[c(1:2, 4:3, 6:5), ]
#                  Estimate  Std. Error    t value     Pr(>|t|)
# (Intercept)  4.692786e+01 7.008139762  6.6961935 4.188891e-07
# wt          -1.062827e+01 8.311169003 -1.2787937 2.122666e-01
# I(wt^2)      2.079131e+00 2.333864211  0.8908534 3.811778e-01
# disp        -3.172401e-02 0.060528241 -0.5241191 6.046355e-01
# wt:disp     -2.660633e-02 0.032228884 -0.8255431 4.165742e-01
# I(disp^2)    2.019044e-04 0.000135449  1.4906301 1.480918e-01

This yields the same values. Note that I used raw=TRUE for comparison purposes.

jay.sf
  • 60,139
  • 8
  • 53
  • 110