3

I'm trying to understand polynomial fitting with R. From my research on the internet, there seem to be two methods. Assuming I want to fit a cubic curve ax^3 + bx^2 + cx + d to some dataset, I can use either:

lm(dataset, formula = y ~ poly(x, 3))

or

lm(dataset, formula = y ~ x + I(x^2) + I(x^3))

However, when I tried them in R, I ended up with two different curves with completely different intercepts and coefficients. Is there anything about polynomials I'm not getting right here?

chsk
James Ngo
  • 2
    Different coefficients... probably. Almost certainly in fact. But different predicts? Unlikely. You need to show full code if you expect a careful consideration of your claim. – IRTFM Nov 25 '19 at 01:08
  • 2
    The predictions are almost certainly the same. If you want poly to mimic using I then use the raw=TRUE parameter. – Dason Nov 25 '19 at 01:33

2 Answers

2

This comes down to what the two functions do. By default, poly generates orthogonal polynomials rather than raw powers of x. Compare the values of poly(dataset$x, 3) with those of I(dataset$x^3). Your coefficients differ because the regressor columns that end up in the linear model are different in the two cases, even though both sets of columns describe the same family of cubic curves.
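To make the difference between the two bases concrete, here is a minimal sketch. The x values are made up for illustration and are not from the question's dataset.

```r
# Hypothetical x values, chosen only for illustration.
x <- seq(1, 10, length.out = 20)

orth <- poly(x, 3)              # default: orthogonal polynomial basis
raw  <- poly(x, 3, raw = TRUE)  # raw powers: x, x^2, x^3

# The third raw column is simply x cubed.
all.equal(unname(raw[, 3]), x^3)

# The orthogonal columns have unit length and are mutually orthogonal,
# so their cross-product is numerically the identity matrix.
round(crossprod(orth), 10)

# Each orthogonal column also sums to (numerically) zero,
# i.e. it is orthogonal to the intercept column.
round(colSums(orth), 10)

# The raw columns, by contrast, are far from orthogonal to each other.
round(cor(raw), 3)
```

Because the two sets of columns hold different numbers, lm has to assign them different coefficients to describe the same curve.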

As 42 pointed out, your predicted values will be the same up to floating-point rounding, because both sets of regressors span the same space of cubic polynomials. If a is your first linear model and b is your second, b$fitted.values - a$fitted.values should be essentially 0 at all points.
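A quick way to check this yourself is sketched below. The data are simulated (an assumption, since the question's dataset was not posted), and c_raw is just an illustrative name for a third fit that uses the raw = TRUE argument mentioned in the comments.

```r
# Simulated data standing in for the question's dataset (an assumption).
set.seed(1)
dataset <- data.frame(x = seq(-2, 2, length.out = 50))
dataset$y <- 1 + 2 * dataset$x - dataset$x^3 + rnorm(50, sd = 0.5)

a     <- lm(y ~ poly(x, 3), data = dataset)              # orthogonal basis
b     <- lm(y ~ x + I(x^2) + I(x^3), data = dataset)     # raw powers
c_raw <- lm(y ~ poly(x, 3, raw = TRUE), data = dataset)  # raw powers via poly

# The coefficients of a and b differ ...
coef(a)
coef(b)

# ... but the fitted curves agree up to floating-point rounding.
max(abs(fitted(a) - fitted(b)))
all.equal(fitted(a), fitted(b))

# With raw = TRUE, poly gives the same coefficients as the I() formula
# (only the coefficient names differ).
all.equal(unname(coef(b)), unname(coef(c_raw)))
```

Both all.equal calls should return TRUE. The choice between the two forms is mostly practical: the raw coefficients read directly as a, b, c, d in the cubic, while the orthogonal basis tends to be better conditioned numerically.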

Daniel V
0

I get it now. The difference comes from how R computes raw polynomials versus orthogonal polynomials. Thanks for the help, everyone.

James Ngo