Polynomial fitting with R using poly vs. I function

Question

I'm trying to understand polynomial fitting with R. From my research on the internet, there seem to be two methods. Assuming I want to fit a cubic curve ax^3 + bx^2 + cx + d to some dataset, I can either use:

lm(dataset, formula = y ~ poly(x, 3))

or

lm(dataset, formula = y ~ x + I(x^2) + I(x^3))

However, when I tried them in R, I ended up with two different curves with completely different intercepts and coefficients. Is there anything about polynomials I'm not getting right here?

Different coefficients... probably. Almost certainly in fact. But different predicts? Unlikely. You need to show full code if you expect a careful consideration of your claim. — IRTFM, Nov 25 '19 at 01:08
The predictions are almost certainly the same. If you want poly to mimic using I then use the raw=TRUE parameter. — Dason, Nov 25 '19 at 01:33

score 2 · Answer 1 · answered Nov 25 '19 at 01:37

This comes down to what the two functions do. By default, poly generates orthogonal polynomials (its columns are orthonormal), whereas I(x^2) and I(x^3) pass the raw powers of x through unchanged. Compare the values of poly(dataset$x, 3) with those of dataset$x, dataset$x^2, and dataset$x^3. Your coefficients differ because the columns that end up in the model matrix are different, even though both sets of columns describe the same family of cubic curves.
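To see the difference in the columns themselves, here is a minimal sketch. The x values are made up for illustration (they are not the asker's data), and the names ortho and raw are just labels I chose.

```r
# Hypothetical x values; only the basis columns are inspected here
x <- seq(1, 10, length.out = 20)

ortho <- poly(x, 3)            # orthogonal polynomial basis (the default)
raw   <- cbind(x, x^2, x^3)    # the columns that x + I(x^2) + I(x^3) produces

head(ortho, 3)
head(raw, 3)

# Columns of the orthogonal basis have unit length and are mutually
# orthogonal, so their cross-product is numerically the identity matrix
round(crossprod(ortho), 10)

# The raw powers, by contrast, are strongly correlated with each other
cor(raw)
```

Because the two bases contain different numbers, the coefficients that multiply them have to differ too.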

As 42 pointed out, your predicted values will be the same up to floating-point error. If a is your first linear model and b is your second, b$fitted.values - a$fitted.values should be essentially 0 at all points.
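A quick way to check this is sketched below. The data are simulated (the asker's dataset was never posted), so the true coefficients, noise level, and the name a_raw are my own assumptions. The last model uses raw = TRUE, as Dason suggested in the comments.

```r
set.seed(1)  # hypothetical simulated data, not the asker's dataset
dataset <- data.frame(x = seq(-2, 2, length.out = 50))
dataset$y <- 1 + 2 * dataset$x - 0.5 * dataset$x^3 + rnorm(50, sd = 0.3)

a     <- lm(y ~ poly(x, 3), data = dataset)              # orthogonal basis
b     <- lm(y ~ x + I(x^2) + I(x^3), data = dataset)     # raw powers
a_raw <- lm(y ~ poly(x, 3, raw = TRUE), data = dataset)  # raw powers via poly

coef(a)  # coefficients on the orthogonal basis
coef(b)  # coefficients on the raw powers; these differ from coef(a)

# Same fitted curve, up to floating-point error
max(abs(fitted(a) - fitted(b)))
all.equal(fitted(a), fitted(b))

# raw = TRUE reproduces the coefficients of the I() model
all.equal(unname(coef(a_raw)), unname(coef(b)))
```

If the two curves look different when plotted, the usual cause is in the plotting or prediction code rather than in the fits themselves.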

score 0 · Answer 2 · answered Nov 25 '19 at 01:50

I got it now. The difference comes from how R computes raw polynomials versus orthogonal polynomials. Thanks, everyone, for the help.

James Ngo
