1

This question builds on Extract prediction band from lme fit but with non-linear mixed models.

I have a dataset of response values that are grouped by "entry." I used an AIC-model selection procedure to test which type of model (linear, logarithmic, exponential, etc.) best represented the relationship between the response and predictor. Now I want to plot the fitted values for the data in each entry, and then across entries. I also want to plot the accompanying confidence band on the overall trend--see code provided on Ben Bolker's blog and the post above (I understand this is some finesse to the interpretation but that is another post). The latter is where I'm running into problems--see this example code:

#Load nlme
library(nlme)

#Create data frame
set.seed(6)
df=data.frame(y=c(1:5+runif(5,0,1),21:25+runif(5,1,5)),
              x=rep(1:5,2),entry=rep(letters[1:2],each=5))

#Group data by entry
df=groupedData(y~x|entry,data=df)

#Build model
mod=nlme(y~a+b*x,fixed=a+b~1,random=~a+b~1,start=c(a=1,b=1),data=df)

#Create data frame for predictions
pred=do.call(rbind,lapply(rownames(coef(mod)),function(i) {
  data.frame(x=seq(0.1,max(df[df$entry==i,"x"]),0.1),entry=i) } ) )
#Grab coefficients from model for a and b
a=coef(mod)[match(pred$entry,rownames(coef(mod))),1]
b=coef(mod)[match(pred$entry,rownames(coef(mod))),2]
#Grab fixed coefficients for overall trend and fitted values for individual entries using a and b above
pred$fixed=fixef(mod)[1]+fixef(mod)[2]*pred$x 
pred$fitted=a+pred$x*b

#Get SEs on predictions using code from Ben Bolker's website
predvar=diag(as.matrix(data.frame(a,b)) %*% vcov(mod)%*% t(as.matrix(data.frame(a,b))))
pred$fixed.SE=sqrt(predvar)
pred$fixed.SE2=sqrt(predvar+mod$sigma^2)

#Plot
ggplot()+geom_point(data=df,aes(x=x,y=y),col="grey50",alpha=0.5)+
  geom_line(data=subset(pred,max(df$x)>x & x>1),aes(x=x,y=fitted,group=entry),col="grey50")+
  geom_ribbon(data=subset(pred,max(df$x)>x & x>1),aes(x=x,ymin=fixed-2*fixed.SE,ymax=fixed+2*fixed.SE),alpha=0.5,fill="grey10")+
  geom_ribbon(data=subset(pred,max(df$x)>x & x>1),aes(x=x,ymin=fixed-2*fixed.SE2,ymax=fixed+2*fixed.SE2),alpha=0.3,fill="grey50")+
  geom_line(data=subset(pred,max(df$x)>x & x>1),aes(x=x,y=fixed),col="black",lwd=1.1)

The resulting plot looks like this, with the confidence band bouncing back and forth:

enter image description here

I suspect I have gone awry somewhere along the line (perhaps in the matrix multiplication?). Any help is appreciated, including suggestions on whether this is a good idea!

Community
  • 1
  • 1
jslefche
  • 4,379
  • 7
  • 39
  • 50
  • 2
    you can fix most of your plotting problems by adding `group=entry` to your aesthetics list for `geom_ribbon`. (I would also recommend setting `predsub <- subset(pred,max(df$x)>x & x>1)` and substituting `data=predsub` rather than repeating all this stuff every time.) However, this leads to giant confidence intervals on the `entry="a"` group, probably because your model is a bit weird/overfitted ... – Ben Bolker May 14 '13 at 21:41
  • (i.e. by weird/overfitted I mean the fact that you are treating `a` and `b` as random-effect parameters, varying at the level of `entry`, but there are only two levels of that grouping variable ...) – Ben Bolker May 14 '13 at 21:51
  • Hi Ben, thanks for responding! I'm glad to know that the weird plot was not due to an error in the math (e.g., that I got the right design matrix, etc.). I'm afraid the simplified example data may have generated a wonky model, but for my real dataset, entry does indeed represent a random variable. Thanks for also not pointing out that I essentially fit a linear model when asking about non-linear ones ;). I'll go ahead and mark this as answered – jslefche May 15 '13 at 00:36
  • 1
    if I have a chance I will post an answer based on my comment above. Feel free to post one yourself, though. – Ben Bolker May 15 '13 at 01:13

0 Answers0