0

I'm doing a survey in which I get public available data and re-run the statistics the authors did in the first place. I came across this article that ran nested mixed models with the lme4 package. When I ran the same models, I get the same estimates, with the same CI ranges, except for the intercept. I've tried scaling the continuous predictor, and it only made it worse. I tried fitting the model with REML being TRUE or FALSE, and it also didn't help. How and why can this be happening?? Hint: The R versions differ, and maybe the packages versions, but I don't think my intercept estimates should go from 13.39 (author's result) to 5.47 (my result).

Edit: The data set is quite large, so I'll post just the heading

 Species      Island Lizard_ID Headbob_duration_sec Background_speed_mm.per.sec Average_background_speed_mm.per.sec Habitat_light.log
1 Anolis_cooki Puerto_Rico      P232             6.300000                    8.933370                            8.933370          2.082839
2 Anolis_cooki Puerto_Rico      P233             2.133333                    8.518135                            9.331636          2.541373
3 Anolis_cooki Puerto_Rico      P233             4.766667                   10.076557                            9.331636          2.541373
4 Anolis_cooki Puerto_Rico      P233             7.133333                    9.641858                            9.331636          2.541373
5 Anolis_cooki Puerto_Rico      P233             5.466667                    9.347104                            9.331636          2.541373
6 Anolis_cooki Puerto_Rico      P233             6.000000                    9.074527                            9.331636          2.541373

Here is the model the authors (and I) did:

library(lme4)

modDHB<-lmer(Headbob_duration_sec~Average_background_speed_mm.per.sec+
Habitat_light.log+
Island+(1|Lizard_ID)+(Average_background_speed_mm.per.sec+Habitat_light.log|Species),
na.action=na.omit,control=lmerControl(optimizer="bobyqa"))

The model predicts the duration of a lizard behavior (head bob) as a function of background speed, environment light, and island the animal was observed. It uses lizard species as a random factor with lizard ID nested within species.

I'm not sure which optimizer the authors used, but I used bobyqa to prevent conversion problems.

Here is my output:

 summary(modDHB)
Linear mixed model fit by REML ['lmerMod']
Formula: Headbob_duration_sec ~ Average_background_speed_mm.per.sec +  
    Habitat_light.log + Island + (1 | Lizard_ID) + (Average_background_speed_mm.per.sec +  
    Habitat_light.log | Species)
Control: lmerControl(optimizer = "bobyqa")

REML criterion at convergence: 16746

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.4199 -0.3470 -0.0370  0.1199  9.7081 

Random effects:
 Groups    Name                                Variance  Std.Dev. Corr       
 Lizard_ID (Intercept)                         1.412e+01 3.75754             
 Species   (Intercept)                         7.099e+00 2.66442             
           Average_background_speed_mm.per.sec 1.444e-04 0.01202  -1.00      
           Habitat_light.log                   4.741e+00 2.17739  -0.74  0.74
 Residual                                      4.081e+01 6.38801             
Number of obs: 2490, groups:  Lizard_ID, 429; Species, 16

Fixed effects:
                                    Estimate Std. Error t value
(Intercept)                          5.47746    1.48225   3.695
Average_background_speed_mm.per.sec  0.05186    0.06082   0.853
Habitat_light.log                   -1.41167    0.73028  -1.933
IslandPuerto_Rico                    7.91151    1.27021   6.228

Correlation of Fixed Effects:
            (Intr) A___.. Hbtt_.
Avrg_bc__.. -0.356              
Hbtt_lght.l -0.597 -0.033       
IslndPrt_Rc -0.628  0.009  0.065
convergence code: 0
boundary (singular) fit: see ?isSingular

As I said, everything is fine, except for the intercept estimate, which should be 13.39.

Leonardo
  • 3
  • 3
  • 2
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Apr 03 '20 at 18:51
  • please provide a sample of your dataset so we can reproduce your error – StupidWolf Apr 03 '20 at 19:00
  • I added more details. Hopefully that helps. – Leonardo Apr 03 '20 at 19:44

1 Answers1

0

Some calculations have a probabilistic components, so solutions can be different, for example if starting values are differing.

smurfit89
  • 327
  • 5
  • 17