1

I am trying to get a confidence interval for my response in a poisson regression model. Here is my data:

X <- c(1,0,2,0,3,1,0,1,2,0)
Y <- c(16,9,17,12,22,13,8,15,19,11)

What I've done so far: (i) read my data (ii) fit a Y by poisson regression with X as a covariate

model <- glm(Y ~ X, family = "poisson", data = mydata)

(iii) use predict()

predict(model,newdata=data.frame(X=4),se.fit=TRUE,interval="confidence",level=0.95, type = "response")

I was expecting to get "fit, lwr, upr" for my response but I got the following instead:

$fit
       1 
30.21439 

$se.fit
       1 
6.984273 

$residual.scale
[1] 1

Could anyone offer some suggestions? I am new to R and struggling with this problem for a long time. Thank you very much.

PPG
  • 13
  • 3
  • This link may solve your problem http://www.sthda.com/english/articles/40-regression-analysis/166-predict-in-r-model-predictions-and-confidence-intervals – Ravi Saroch Mar 18 '21 at 06:22
  • Does this answer your question? [How does predict.lm() compute confidence interval and prediction interval?](https://stackoverflow.com/questions/38109501/how-does-predict-lm-compute-confidence-interval-and-prediction-interval) – Ravi Saroch Mar 18 '21 at 06:24
  • You should not be asking statistical methods questions here. Instead post to stats.stackexchange.com after first searching to see if it’s been answered there. – IRTFM Mar 18 '21 at 06:25

1 Answers1

2

First, the function predict() that you are using is the method predict.glm(). If you look at its help file, it does not even have arguments 'interval' or 'level'. It doesn't flag them as erroneous because predict.glm() has the (in)famous ... argument, that absorbs all 'extra' arguments. You can write confidence=34.2 and interval="woohoo" and it still gives the same answer. It only produces the estimate and the standard error.

Second, one COULD then take the fit +/- 2*se to get an approximate 95 percent confidence interval. However, without getting into the weeds of confidence intervals, pivotal statistics, non-normality in the response scale, etc., this doesn't give very satisfying intervals because, for instance, they often include impossible negative values.

So, I think a better approach is to form an interval in the link scale, then transform it (this is still an approximation, but probably better):

X <- c(1,0,2,0,3,1,0,1,2,0)
Y <- c(16,9,17,12,22,13,8,15,19,11)
model <- glm(Y ~ X, family = "poisson")
tmp <- predict(model, newdata=data.frame(X=4),se.fit=TRUE, type = "link")
exp(tmp$fit - 2*tmp$se.fit)
       1 
19.02976
exp(tmp$fit + 2*tmp$se.fit)
   1 
47.97273 
user11599
  • 326
  • 1
  • 5