I'm having trouble matching R's linear model confidence and prediction intervals for point estimates:
set.seed(3)
x <- rnorm(392, 20, 5)
y <- 2*x + 3 + rnorm(392, sd=3)
lm.fit <- lm(y~x)
(est <- as.vector(matrix(c(1,20),nrow=1,ncol=2) %*% lm.fit$coefficients))
(ci <- est + c(-1,1) * qt(.975, df=lm.fit$df.residual)*sqrt(var(lm.fit$residuals)/(lm.fit$df.residual)))
(pred <- predict(lm.fit,data.frame(x=(c(20))), interval="confidence", se.fit = TRUE))
sqrt(var(lm.fit$residuals)/(lm.fit$df.residual))
pred$se.fit
If you run the code above, you'll see that the point estimates are the same, but I get slightly different standard errors and therefore confidence intervals that are slightly off. How does R match the standard error?
I've seen in the documentation for predict.lm() (below) that R uses pred.var in the predict.lm function, but pred.var uses res.var, and doesn't say where res.var comes from. res.var also does not match the variance of the residuals:
var(lm.fit$residuals)
(http://127.0.0.1:25345/library/stats/html/predict.lm.html)
Thanks!