I know the answer has been chosen for this post. I still wish to point out how to specify a correct error term/random effect when fitting a aov
or lmer
model to a multi-way repeated-measures data. I assume that both independent variables (IVs) are fixed, and are crossed with each other and with subjects, meaning all subjects are exposed to all combinations of the IVs. I am going to use data taken from Kirk’s Experimental Deisign: Procedures for the Behavioral Sciences (2013).
library(lme4)
library(foreign)
library(lmerTest)
library(dplyr)
file_name <- "http://www.ats.ucla.edu/stat/stata/examples/kirk/rbf33.dta" #1
d <- read.dta(file_name) %>% #2
mutate(a_f = factor(a), b_f = factor(b), s_f = factor(s)) #3
head(d)
## a b s y a_f b_f s_f
## 1 1 1 1 37 1 1 1
## 2 1 2 1 43 1 2 1
## 3 1 3 1 48 1 3 1
## 4 2 1 1 39 2 1 1
## 5 2 2 1 35 2 2 1
In this study 5 subjects (s) are exposed to 2 treatments - type of beat (a) and training duration (b) - with 3 levels each. The outcome variable is the attitude toward minority. In #3 I made a, b, and s into factor variables, a_f, b_f, and s_f. Let p and q be the numbers of levels for a_f and b_f (3 each), and n be the sample size (5).
In this example the degrees of freedom (dfs) for the tests of a_f, b_f, and their interaction should be p-1=2, q-1=2, and (p-1)*(q-1)=4, respectively. The df for the s_f error term is (n-1) = 4, and the df for the Within (s_f:a_f:b_f) error term is (n-1)(pq-1)=32. So the correct model(s) should give you these dfs.
Using aov
Now let’s try different model specifications using aov
:
aov(y ~ a_f*b_f + Error(s_f), data=d) %>% summary() # m1
aov(y ~ a_f*b_f + Error(s_f/a_f:b_f), data=d) %>% summary() # m2
aov(y ~ a_f*b_f + Error(s_f/a_f*b_f), data=d) %>% summary() # m3
Simply specifying the error as Error(s_f)
in m1 gives you the correct dfs and F-ratios matching the values in the book. m2 also gives the same value as m1, but also the infamous “Warning: Error() model is singular”. m3 is doing something strange. It is further partitioning Within residuals in m1 (634.9) into residuals for three error terms: s_f:a_f (174.2), s_f:b_f (173.6), and s_f:a_f:b_f (287.1). This is wrong, since we would not get three error terms when we run a 2-way between-subjects ANOVA! Multiple error terms are also contrary to the point of using block factorial designs, which allows us to use the same error term for the test of A, B, and AB, unlike split-plot designs which requires 2 error terms.
Using lmer
How can we get the same dfs and F-values using lmer? If your data is balanced, the Kenward-Roger approximation used in the lmerTest
will give you exact dfs and F-ratio.
lmer(y ~ a_f*b_f + (1|s_f), data=d) %>% anova() # mem1
lmer(y ~ a_f*b_f + (1|s_f/a_f:b_f), data=d) %>% anova() # mem2
lmer(y ~ a_f*b_f + (1|s_f/a_f*b_f), data=d) %>% anova() # mem3
lmer(y ~ a_f*b_f + (1|s_f:a_f:b_f), data=d) %>% anova() # mem4
lmer(y ~ a_f*b_f + (a_f*b_f|s_f), data=d) %>% anova() # mem5
Again simply specifying the random effect as (1|s_f)
give you the correct dfs and F-ratios (mem1). mem2-5 did not even give results, presumably the numbers of random effects it needed to estimate were greater than the sample size.