2

If I recall right, there isn't an exact way to extract lm() coefficients from ggplot's geom_smoth() function. I've tried modelling the output without ggplot, but I keep getting very different answers.

An example with the mpg dataset:

ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method='lm', formula=y~poly(x,2))

test1

Instead of the default loess model from geom_smooth() I'm actively forcing a polynomial of degree 2 so I can build an equation and solve for its minimum. The output looks like it has a minimum around displ=5.75 or so.

Yet, when I try to do the modelling outside of ggplot:

test <- lm(mpg$hwy ~ poly(mpg$displ,2))
polyfun <- function(x) {coef(test)[1] + coef(test)[2]*x + coef(test)[3]*(x)^2 }
curve(polyfun, from=0, to=7)
optimize(polyfun, interval = c(0, 7), maximum = F)

I get a crazily different answer:

test2

with the result from optimize being: $minimum [1] 1.308562. I'm pretty sure the form of my equation using the model coefficients is right, but the plotted equation is very wrong. Is there something obvious that I'm missing?

AI52487963
  • 1,253
  • 2
  • 17
  • 36
  • 1
    Just use `predict` on your `lm` object instead of rolling your own `polyfun`. – Gregor Thomas Mar 01 '17 at 21:39
  • Also, you might consider renaming your question. The title sounds like it's about how to extract a model fit from ggplot and geom_smooth (which would be a duplicate of [this question](http://stackoverflow.com/q/9789871/903061)), but really it's about how to make predictions from a model with polynomial terms. – Gregor Thomas Mar 01 '17 at 21:42
  • Good point, title edited. – AI52487963 Mar 01 '17 at 21:43
  • 1
    Does this question: [poly() in lm(): difference between raw vs. orthogonal](http://stackoverflow.com/q/29999900/903061) answer your question? If so, I will mark as duplicate, but if you need more I can post an answer. – Gregor Thomas Mar 01 '17 at 22:02
  • Yes that was the answer, thanks Gregor! By adding the `raw=T` part to `poly()` I got the result I was after. Feel free to mark as duplicate. – AI52487963 Mar 01 '17 at 22:22
  • 1
    I would note that orthogonal polynomials are preferred for inference. And the best way to get predictions is to just use `predict`. Rather than your custom function something like this should work well: `test2 <- lm(hwy ~ poly(displ,2), data = mpg); pred = data.frame(displ = seq(0, 7, length.out = 100)); pred$fitted = predict(test2, newdata = pred)` (and it works whether or not there is a polynomial). – Gregor Thomas Mar 01 '17 at 22:44

0 Answers0