How do I find the confidence interval (CI) of a simple linear regression model? I found different ways to do it, but the CIs they plot differ from each other.

I do not have a statistics background, so I am not sure which one is correct. I only have a very basic understanding of CIs and linear regression.

In my first attempt, I just used the function confint() and plotted straight lines from its output.

plot(y~x, data=df)

abline(lm.model)

ci<-confint(lm.model, level=0.95)
abline(ci[1], ci[2])
abline(ci[3], ci[4])

My second attempt is adapted from the question "Plotting a 95% confidence interval for a lm object":

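# Grid of x values spanning the observed range (x taken from df)
newx <- seq(min(df$x), max(df$x), by = 0.05)
# Fitted values with lower/upper 95% confidence limits at each grid point
conf_interval <- predict(lm.model, newdata=data.frame(x=newx), interval="confidence",
                         level = 0.95)
plot(y~x, data=df, xlab="x", ylab="y", main="Regression")
abline(lm.model, col="lightblue")
# Column 2 is the lower limit ("lwr"), column 3 the upper limit ("upr")
lines(newx, conf_interval[,2], col="blue", lty=2)
lines(newx, conf_interval[,3], col="blue", lty=2)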

It seems to me that both approaches are meant to produce the same thing, but I am not sure which one I am looking for. Any help is appreciated. Many thanks.

user16971617

1 Answer

You may try `geom_smooth`, which draws the fitted line together with its 95% confidence band:

library(dplyr)
library(ggplot2)
df %>%
  ggplot(aes(x,y)) +
  geom_point() +
  geom_smooth(method = "lm", level = 0.95)


Park
  • So the second attempt builds the correct CI? Then what is wrong with the first one? – user16971617 Oct 29 '21 at 02:06
  • @user16971617 `confint` returns CIs for the model parameters (intercept and slope), not for the predicted values. For more information, see [confint function](https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/confint). The second attempt is correct. – Park Oct 29 '21 at 02:13
  • Sorry, but I still don't understand the difference between a CI for the parameters and a CI for the predicted values. Aren't they the same? – user16971617 Oct 29 '21 at 03:20
  • For example, `ci` gives, for `(Intercept)`, 5.27648921 (2.5 %) and 7.03001947 (97.5 %), and for `x`, -0.06696724 (2.5 %) and -0.02342506 (97.5 %). Doesn't that mean we are 95% confident that the regression line lies in the region between `abline(ci[1], ci[2])` and `abline(ci[3], ci[4])`? – user16971617 Oct 29 '21 at 03:22
  • @user16971617 The CIs for `beta_0 hat` (`b0`) and `beta_1 hat` (`b1`) are `b0 +- t*se(b0)` and `b1 +- t*se(b1)`; the CI for the mean response at `x` is `y hat +- t * sqrt(MSE * (1/n + (x - x bar)^2/Sxx))`; and the prediction interval for a new observation at `x` is `y hat +- t * sqrt(MSE * (1 + 1/n + (x - x bar)^2/Sxx))`. Take a look at those components. They're pretty different. – Park Oct 29 '21 at 04:16
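
To make the difference discussed in the comments concrete, here is a minimal sketch that rebuilds each interval by hand and compares it with R's output. The data are simulated (an assumption for illustration, since the asker's `df` is not shown); the names `df` and `lm.model` are reused from the question, and the coefficients used to generate `y` are arbitrary. It checks that `confint()` matches `b +- t*se(b)` for each coefficient separately, while `predict(..., interval = "confidence")` matches the mean-response formula, whose width changes with `x`.

```r
# Simulated data: an assumption for illustration, not the asker's df
set.seed(1)
df <- data.frame(x = seq(0, 50, length.out = 40))
df$y <- 6 - 0.045 * df$x + rnorm(40, sd = 1)

lm.model <- lm(y ~ x, data = df)

# Ingredients shared by all the formulas
n     <- nrow(df)
xbar  <- mean(df$x)
Sxx   <- sum((df$x - xbar)^2)
MSE   <- sum(residuals(lm.model)^2) / (n - 2)  # residual variance estimate
tcrit <- qt(0.975, df = n - 2)                 # t quantile for a 95% interval

# 1) Parameter CIs: b +- t * se(b), one interval per coefficient
b  <- coef(lm.model)
se <- coef(summary(lm.model))[, "Std. Error"]
ci_manual <- cbind(b - tcrit * se, b + tcrit * se)
ci <- confint(lm.model, level = 0.95)
print(all.equal(unname(ci), unname(ci_manual)))

# 2) CI for the mean response and prediction interval on a grid of x values
newx <- seq(min(df$x), max(df$x), length.out = 100)
yhat <- unname(b[1] + b[2] * newx)
half_mean <- tcrit * sqrt(MSE * (1/n + (newx - xbar)^2 / Sxx))
half_new  <- tcrit * sqrt(MSE * (1 + 1/n + (newx - xbar)^2 / Sxx))

conf_band <- predict(lm.model, newdata = data.frame(x = newx),
                     interval = "confidence", level = 0.95)
pred_band <- predict(lm.model, newdata = data.frame(x = newx),
                     interval = "prediction", level = 0.95)
print(all.equal(unname(conf_band[, "lwr"]), yhat - half_mean))
print(all.equal(unname(pred_band[, "upr"]), yhat + half_new))

# 3) Plot: curved confidence band (blue) vs. the two lines from confint() (red)
plot(y ~ x, data = df, main = "Regression")
abline(lm.model, col = "lightblue")
lines(newx, conf_band[, "lwr"], col = "blue", lty = 2)
lines(newx, conf_band[, "upr"], col = "blue", lty = 2)
abline(ci[1], ci[2], col = "red", lty = 3)  # lower intercept with lower slope
abline(ci[3], ci[4], col = "red", lty = 3)  # upper intercept with upper slope
```

The red lines pair the two lower limits and the two upper limits as if the intercept and slope could be chosen independently. Each of those limits comes from a separate interval for a single parameter, so the region between the red lines is not a 95% band for the fitted line. The blue band accounts for both estimates together through the `(x - x bar)^2/Sxx` term, which is why it is narrowest at `x bar` and widens towards the ends of the data.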