1

I am trying to plot an interaction between two continuous variables in R. However, my data is multilevel (people nested within days) so I need to account for the nested structure of my data when I am graphing it. I analyze my data using the lme4 library to account for the nested structure, but I'm having a hard time figuring out how to graph it.

## example data
spin = runif(600, 1, 24)
reg = runif(600, 1, 15)
ID = rep(c("1","2","3","4","5", "6", "7", "8", "9", "10"))
day = rep(1:30, each = 10)
testdata <- data.frame(
  spin, reg, ID, day)
testdata$fatigue <- testdata$spin*testdata$reg/10*rnorm(30, mean=3, sd=2)

Here I have my independent variables of spin and reg, dependent variable of fatigue, and people (ID) nested within days. I run my model below.

## running my multilevel model with lme4
library(lme4)
m1 <- lmer(fatigue ~ spin * reg + ( 1 | ID), data = testdata, REML = T)
(m1)
confint(m1, test = "Chisq")

Let's assume I have an interaction between spin and reg. I need to put my continuous variable into a categorical variable in order to plot it.

So I create categorical variable based on one of my continuous variables. Here I pick spin. Note: not sure if this code below is quite right for what I want. Might have to do standard error? Also doesn't take my nested data structure into account, but not sure what to do otherwise.

x <- mean(testdata$spin, na.rm = T)
print(x)
y <- sd(testdata$spin, na.rm = T)
print(y)

testdata$SpinLevel[testdata$spin > x+y] <- "High"
testdata$SpinLevel[testdata$spin > x-y & testdata$spin <= x+y] <- "Mean"
testdata$SpinLevel[testdata$spin <= x-y] <- "Low"

rm(x,y)

Based on what I've found online, I can create a basic plot to show effects. BUT doesn't take nested structure into account (people -- the variable IDs-- are nested within days).

library(ggplot2)
ggplot(testdata,aes(reg,fatigue,linetype=SpinLevel))+
  geom_smooth(method="lm",se=FALSE)

This ggplot is good for interpreting the basic effect, but the lines are likely skewed because they don't take my nested structure of my data into account (people within days).

I can also graph my model with the effects library. This DOES take nested structure into account. Except the graph is not pretty and is in quartiles, and is very difficult to interpret. I would like it to be by high, mean, and low and all on the same graph. But I'm not sure how to do this.

library(effects)
plot(effect("spin*reg", m1), grid=TRUE, labels = T,
  xlevels=list(spin=quantile(testdata$spin, seq(0, 1, 0.25))))

Any ideas? Would be much appreciated.

sk17
  • 41
  • 1
  • 6
  • I see nothing in the model spec to indicate nesting within `day`. – IRTFM Jul 10 '16 at 19:00
  • http://stackoverflow.com/questions/9447329/how-to-plot-the-results-of-a-mixed-model – Hack-R Jul 10 '16 at 19:31
  • I would recommend the `broom` package as a replacement for `coefplot2` ... but I think the OP wants an effects plot, not a coefficient plot ... – Ben Bolker Jul 10 '16 at 20:38
  • Sorry, @42, I guess I use the term "nesting" in statistics sense--you can't assume responses are independent when your data is from the same people that are responding over several days. By using the sample people, I'm breaking the statistical assumption of independent observations, which is why I am using multilevel modeling. I included `day` in my code to just show that I'm using multiple people over multiple days, because that's how my data is formatted. @BenBolker-- you are right that I do want effects plots. – sk17 Jul 11 '16 at 20:13
  • So you are ignoring potential auto-correlation across adjacent or nearby days. Are your "days" so widely separated that this makes good physiologic sense? – IRTFM Jul 11 '16 at 20:32
  • @42, in this example model I'm only accounting the non-independence in regards to `ID`. I could run a model and account for `day` if I think it matters for my hypotheses, but I am not here. – sk17 Jul 13 '16 at 19:12

2 Answers2

2

Data setup:

set.seed(101)
spin = runif(600, 1, 24)
reg = runif(600, 1, 15)
ID = rep(c("1","2","3","4","5", "6", "7", "8", "9", "10"))
day = rep(1:30, each = 10)
testdata <- data.frame(spin, reg, ID, day)
testdata$fatigue <- testdata$spin*testdata$reg/10*rnorm(30, mean=3, sd=2)

Is ID really nested within day? Technically, that would suggest that individual 1 (ID=1) measured on day 1 represents a different person that ID=1 measured on day 2 ... ?

library(lme4)
m1 <- lmer(fatigue ~ spin * reg + ( 1 | ID),
           data = testdata, REML = TRUE)
confint(m1, method = "Wald", parm="beta_")
## instead of test="Chisq", which doesn't work
##                    2.5 %    97.5 %
## (Intercept) -13.44726318 7.4959080
## spin         -0.04751327 1.2328254
## reg          -0.86763792 1.1550787
## spin:reg      0.11263238 0.2541709

Why isn't day in the model ... ?

Set up prediction data:

## midpoints of bin
 spinvals <- quantile(testdata$spin,seq(0,1,length=5))[2:4]
 pframe <- with(testdata,
           expand.grid(ID=unique(ID),
                       reg=seq(min(reg),max(reg),length.out=51),
                       spin=spinvals))
 pframe$fatigue <- predict(m1,newdata=pframe)
 pframe$spinFac <- factor(pframe$spin,levels=spinvals)
 ## explicit factor() to prevent alphabetization of levels

 library(ggplot2); theme_set(theme_bw())
 g0 <- ggplot(pframe,aes(reg,fatigue,colour=spinFac))+
     geom_line(aes(group=interaction(spinFac,ID)))

 ## bins for cutting testdata into 3 levels (min, 0.33,0.66, max)
 ## label bins by midpoints
 spincuts <- quantile(testdata$spin,seq(0,1,length=4))
 testdata$spinFac <- cut(testdata$spin,
            spincuts,labels=spinvals)

I'm not quite sure why this is flipping the factor levels ...

 g0 + geom_point(data=testdata)

Here's an initial attempt to pull the required data out of the effects object:

library(effects)
ee <- effect("spin*reg", m1,
   xlevels=list(spin=spinvals))
eedat <- with(ee,data.frame(x,fatigue=fit,lwr=lower,upr=upper))
ggplot(eedat,aes(x=reg,y=fatigue,colour=factor(spin)))+
    geom_line()+
    geom_ribbon(aes(group=spin,ymin=lwr,ymax=upr),colour=NA,
                            alpha=0.4)
Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
1

I change the model slightly to make it reflect both ID and day.

How about this:

## example data
spin = runif(600, 1, 24)
reg = runif(600, 1, 15)
ID = rep(c("1","2","3","4","5", "6", "7", "8", "9", "10"))
day = rep(1:30, each = 10)
testdata <- data.frame(
spin, reg, ID, day)
testdata$fatigue <- testdata$spin*testdata$reg/10*rnorm(30, mean=3, sd=2)

## running my multilevel model with lme4
library(lme4)
m1 <- lmer(fatigue ~ spin * reg + ( 1 | ID/day), data = testdata, REML = T)
(m1)
confint(m1, test = "Chisq")

x <- mean(testdata$spin, na.rm = T)
print(x)
y <- sd(testdata$spin, na.rm = T)
print(y)

testdata$SpinLevel[testdata$spin > x+y] <- "High"
testdata$SpinLevel[testdata$spin > x-y & testdata$spin <= x+y] <- "Mean"
testdata$SpinLevel[testdata$spin <= x-y] <- "Low"

rm(x,y)

require(multicomp)
mp <- as.data.frame(confint(glht(m1))$confint)
tmp$Comparison <- rownames(tmp)
ggplot(tmp, aes(x = Comparison, y = Estimate, ymin = lwr, ymax = upr)) + geom_errorbar() + geom_point()

enter image description here

# or
library(multcomp)
tmp <- as.data.frame(confint(glht(m1))$confint)
tmp$Comparison <- rownames(tmp)
ggplot(tmp, aes(x = Comparison, y = Estimate, ymin = lwr, ymax = upr)) + geom_errorbar() + geom_point()

enter image description here

Also:

install.packages("coefplot2", # from this crackpot R coder named Bolker
                 repos="http://www.math.mcmaster.ca/bolker/R",
                 type="source") # I think he died a few years back
                                # jk Ben
library(coefplot2)
coefplot2(m1)
ggplot(tmp, aes(x = Comparison, y = Estimate, ymin = lwr, ymax = upr)) + geom_errorbar() + geom_point()

There are also some very interesting color plots in the answer by a person named Wes here.

Community
  • 1
  • 1
Hack-R
  • 22,422
  • 14
  • 75
  • 131