Different Sum Sq and MSS using lme4::lmer and lmerTest::lmer

Question

I get sums of squares and mean sums of squares 10x higher when I use anova on lmerTest:: lmer compared to lme4:: lmer objects. See the R log file below. Note the warning that when I attach the lmerTest package, the stats::sigma function overrides the lme4::sigma function, and I suspect that it is this that leads to the discrepancy. In addition, the anova report now says that it is a Type III anova rather than the expected Type I. Is this a bug in the lmerTest package, or is there something about use of the Kenward-Roger approximation for degrees of freedom that changes the calculation of SumSQ and MSS and specification of the anova Type that I don't understand?

I would append the test file, but it is confidential clinical trial information. If necessary I can see if I can cobble up a test case.

Thanks in advance for any advice you all can provide about this.

> library(lme4)

Loading required package: Matrix

Attaching package: ‘lme4’

The following object is masked from ‘package:stats’:
sigma

> test100 <- lmer(log(value) ~ prepost * lowhi + (1|CID/LotNo/rep), 
REML = F, data = GSIRlong, subset = !is.na(value))

> library(lmerTest)
Attaching package: ‘lmerTest’
The following object is masked from ‘package:lme4’:
lmer
The following object is masked from ‘package:stats’:
step
Warning message:
replacing previous import ‘lme4::sigma’ by ‘stats::sigma’ when loading 
‘lmerTest’ 

> test200 <- lmer(log(value) ~ prepost * lowhi + (1|CID/LotNo/rep), 
REML = F, data = GSIRlong, subset = !is.na(value))

> anova(test100)
Analysis of Variance Table
              Df  Sum Sq Mean Sq  F value
prepost        1   3.956   3.956  18.4825
lowhi          1 130.647 130.647 610.3836
prepost:lowhi  1   0.038   0.038   0.1758

> anova(test200, ddf = 'Ken')
Analysis of Variance Table of type III  with  Kenward-Roger 
approximation for degrees of freedom

               Sum Sq Mean Sq NumDF  DenDF F.value    Pr(>F)    

prepost         37.15   37.15     1 308.04   18.68 2.094e-05 ***
lowhi         1207.97 1207.97     1 376.43  607.33 < 2.2e-16 ***
prepost:lowhi    0.35    0.35     1 376.43    0.17     0.676   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Update: Thanks, Ben. I did a little code archeology on lmerTest to see if I could dope out an explanation for the above anomalies. First, it turns out that lmerTest::lmer just submits the model to lme4::lmer and then relabels the result as an "mermodLmerTest" object. The only effect of this is to invoke versions of summary() and anova() from the lmerTest package rather than the usual defaults from base and stats. (These lmerTest functions are compiled, and I have not yet gone farther to look at the C++ code.) lmerTest::summary just adds three columns to the base::summary result, giving df, t value, and Pr. Note that lmerTest::anova, by default, computes a type III anova rather than a type I as in stats::anova. (Explanation of my second question above.) Not a great choice if one's model includes interactions. One can request a type I, II, or III anova using the type = 1/2/3 option.

However there are other surprises using the nlmeTest versions of summary and anova, as shown in the R console file below. I used lmerTest's included sleepstudy data so that this code should be replicable.

a. Note that "sleepstudy" has 180 records (with 3 variables)

b. The summaries of fm1 and fm1a are identical except for the added Fixed effects columns. But note that in the lmerTest::summary the ddfs for the intercept and Days are 1371 and 1281 respectively; odd given that there are only 180 records in "sleepstudy."

c. Just as in my original model above, the nlm4 anad nlmrTest versions of anova give very different values of Sum Sq and Mean Sq. (30031 and 446.65 respectively).

d: Interestingly, the nlmrTest versions of anova using Satterthwaite and Kenward-Rogers estimates of the DenDF are wildly different (5794080 and 28 respecitvely). The K-R value seems more reasonable.

Given the above issues, I am reluctant at this point to depend on lmerTest to give reliable p-values. Based on your (Doug Bates's) blog entry (https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html), I am using now (and recommending) the method from the posting by Dan Mirman (http://mindingthebrain.blogspot.ch/2014/02/three-ways-to-get-parameter-specific-p.html) in the final bit of code below to estimate a naive t-test p-value (assuming essentially infinite degrees of freedom) and a Kenward-Rogers estimate of df (using the R package 'pbkrtest' -- the same package used by lmerTest). I couldn't find R code to compute the Satterthwaite estimates. The naive t-test p-value is clearly anti-conservative, but the KR estimate is thought to be pretty good. If the two give similar estimates of p, then I think that one can feel comfortable with a p-value in the range of [naive t-test, KR estimate].

> library(lme4); library(lmerTest); library(pbkrtest); 
dim(sleepstudy)
[1] 180   3
> 
> fm1 <- lme4::lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
> fm1a <- lmerTest::lmer(Reaction ~ Days + (Days|Subject), sleepstudy)
> 
> summary(fm1)
Linear mixed model fit by REML ['lmerMod']
Formula: Reaction ~ Days + (Days | Subject)
   Data: sleepstudy
REML criterion at convergence: 1743.6
Scaled residuals: 
Min      1Q  Median      3Q     Max 
-3.9536 -0.4634  0.0231  0.4634  5.1793 

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Subject  (Intercept) 612.09   24.740       
          Days         35.07    5.922   0.07
 Residual             654.94   25.592       
Number of obs: 180, groups:  Subject, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)  251.405      6.825   36.84
Days          10.467      1.546    6.77

Correlation of Fixed Effects:
     (Intr)
Days -0.138
> summary(fm1a)
Linear mixed model fit by REML t-tests use Satterthwaite approximations to
  degrees of freedom [lmerMod]
Formula: Reaction ~ Days + (Days | Subject)
   Data: sleepstudy

REML criterion at convergence: 1743.6

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.9536 -0.4634  0.0231  0.4634  5.1793 

Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Subject  (Intercept) 612.09   24.740       
          Days         35.07    5.922   0.07
 Residual             654.94   25.592       
Number of obs: 180, groups:  Subject, 18

Fixed effects:
            Estimate Std. Error       df t value Pr(>|t|)    
(Intercept)  251.405      6.825 1371.100  302.06   <2e-16 ***
Days          10.467      1.546 1281.700   55.52   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
     (Intr)
Days -0.138
Warning message:
In deviance.merMod(object, ...) :
  deviance() is deprecated for REML fits; use REMLcrit for the REML
    criterion or deviance(.,REML=FALSE) for deviance calculated at the     REML fit
> 
> anova(fm1)
Analysis of Variance Table
     Df Sum Sq Mean Sq F value
Days  1  30031   30031  45.853

> anova(fm1a, ddf = 'Sat', type = 1)
Analysis of Variance Table of type I  with  Satterthwaite 
approximation for degrees of freedom
     Sum Sq Mean Sq NumDF   DenDF F.value    Pr(>F)    
Days 446.65  446.65     1 5794080  45.853 1.275e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In deviance.merMod(object, ...) :
  deviance() is deprecated for REML fits; use REMLcrit for the REML     criterion or deviance(.,REML=FALSE) for deviance calculated at the REML fit

> anova(fm1a, ddf = 'Ken', type = 1)
Analysis of Variance Table of type I  with  Kenward-Roger 
approximation for degrees of freedom
     Sum Sq Mean Sq NumDF  DenDF F.value    Pr(>F)    
Days 446.65  446.65     1 27.997  45.853 2.359e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In deviance.merMod(object, ...) :
  deviance() is deprecated for REML fits; use REMLcrit for the REML     criterion or deviance(.,REML=FALSE) for deviance calculated at the REML fit
> 
> #  t.test
> coefs <- data.frame(coef(summary(fm1)))
> coefs$p.z <- 2 * (1 - pnorm(abs(coefs$t.value)))
> coefs
             Estimate Std..Error   t.value          p.z
(Intercept) 251.40510   6.824556 36.838311 0.000000e+00
Days         10.46729   1.545789  6.771485 1.274669e-11
> 
> #  Kenward-Rogers
> df.KR <- get_ddf_Lb(fm1, fixef(fm1))
> df.KR
[1] 25.89366
> coefs$p.KR <- 2 * (1 - pt(abs(coefs$t.value), df.KR))
> coefs
             Estimate Std..Error   t.value          p.z       p.KR
(Intercept) 251.40510   6.824556 36.838311 0.000000e+00 0.0000e+00
Days         10.46729   1.545789  6.771485 1.274669e-11 3.5447e-07

not sure. Type III vs Type I is as intended (and the interaction effect wouldn't be expected change due to this change in any case), and the `sigma` import warning shouldn't be a problem. Doesn't happen with very simple data, but I don't have a more complex example handy. Reproducible example? — Ben Bolker, Oct 17 '16 at 13:47

Different Sum Sq and MSS using lme4::lmer and lmerTest::lmer

0 Answers0