I fitted a model with the lmer() function, and the fit itself works well. The model has 11 explanatory variables. If any of three of them is present in the model, the step() function (from the lmerTest package) returns the error: "Variables length differ (found on "...")", where "..." is the formula call. The data contain no NA values: there are 600 rows, and all three of the problematic variables (H, I, J) are factors.

My code is:

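library(purrr)     # for rdunif()
library(lmerTest)  # for lmer() and step()

# 12 columns: A-G take values 1..5, H-J take values 0..1, Z takes values 1..9,
# and M is the grouping variable ("a"/"b").
# matrix() coerces everything to character, so the columns are converted below.
data2 = as.data.frame(matrix(c(rdunif(600*7,1,5),
                         rdunif(600*3,0,1),
                         rdunif(600,1,9),
                         rep(c("a","b"),300)),
                       nrow = 600, byrow = FALSE))
names(data2) = c("A","B","C","D", "E","F","G","H","I","J","Z","M")
data2[,7:10] = lapply(data2[,7:10],factor)                # G, H, I, J as factors
data2[,c(1:6,11)] = lapply(data2[,c(1:6,11)],as.numeric)  # A-F and Z as numeric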

mod1 = lmer(Z ~ A+B+C+D+E+F+G+
          #H+
          #I+
          #J+ 
          (1|M),data2)
step.mod1 = lmerTest::step(mod1) # it works

mod2 = lmer(Z ~ A+B+C+D+E+F+G+H+
          #I+
          #J+ 
          (1|M),data2)
step.mod2 = lmerTest::step(mod2) #it does not work and returns: Variables length differ (found on "A+B+C+D+E+F+G+")
mod3 = lmer(Z ~ A+B+C+D+E+F+G+H+I+J+ 
          (1|M),data2)
step.mod3 = lmerTest::step(mod3) #it does not work and returns: Variables length differ (found on "A+B+C+D+E+F+G+H+I+")

I know that this error is common when the data contain NAs, but what causes it in this case, and how can I fix it?
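
The claims above about the data (600 rows, no NA values, H, I and J stored as factors) can be checked directly. The following is a minimal sketch under some assumptions that are not in my original code: it uses base R's sample() as a stand-in for purrr::rdunif(), builds the columns with data.frame() instead of matrix(), uses an arbitrary seed, and the helper name `draw` is purely illustrative. It only confirms the data layout; it does not by itself explain the step() error.

```r
# Base R only; sample() stands in for purrr::rdunif() (an assumption of this sketch).
set.seed(101)  # arbitrary seed, for reproducibility
n <- 600

# Hypothetical helper: n draws from the discrete uniform distribution on lo..hi
draw <- function(lo, hi) sample(lo:hi, n, replace = TRUE)

# data.frame() keeps each column's own type, so no later as.numeric() is needed
data2 <- data.frame(
  A = draw(1, 5), B = draw(1, 5), C = draw(1, 5),
  D = draw(1, 5), E = draw(1, 5), F = draw(1, 5),
  G = factor(draw(1, 5)),
  H = factor(draw(0, 1)), I = factor(draw(0, 1)), J = factor(draw(0, 1)),
  Z = draw(1, 9),
  M = rep(c("a", "b"), n / 2)
)

nrow(data2)                                # number of rows
colSums(is.na(data2))                      # count of missing values per column
sapply(data2, class)                       # storage class of every column
sapply(data2[c("H", "I", "J")], nlevels)   # number of levels of each factor
```

If every count from colSums() is zero and all columns have the same length, missing values can be ruled out as the source of the error.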

abenci
  • What is your question? We have no idea what your data is or what it even means to say that 13 variables are problematic (other than that they have lengths that differ from the other 43 variables). It seems more like a statistical methodology question than a programming question. You need to decide on how to handle those variables. Perhaps with adequate details provided, you could ask on [statistics.se]. – John Coleman Jun 17 '20 at 20:59
  • If you want to keep the question here, please provide a [mcve]. See [How to make a great R reproducible example?](https://stackoverflow.com/q/5963269/4996248) for what this would mean in R. – John Coleman Jun 17 '20 at 21:06
  • I edited and added a reproducible example. – Tiago Gimenez Jun 17 '20 at 21:55
  • I tried to make your example reproducible but have failed so far. If you use `matrix` to combine a bunch of numeric variables and a character variable, you end up coercing everything to character, which messes things up downstream. (Apologies if I misunderstood your intent: for example, I added "D" to the column names; did you mean to leave it out?) – Ben Bolker Jun 17 '20 at 22:03
  • I'm sure it's correct now. Sorry for the errors – Tiago Gimenez Jun 17 '20 at 23:04
  • are you getting any messages about singular fits? – Ben Bolker Jun 17 '20 at 23:22
  • I'm not. The summary of the model shows that everything is being estimated. – Tiago Gimenez Jun 17 '20 at 23:40
  • That's weird. When I run your first example (using `set.seed(101)` first for reproducibility) I get "boundary (singular) fit: see ?isSingular" – Ben Bolker Jun 17 '20 at 23:55
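
The coercion mentioned in the comments can be seen in a small base R sketch. The object names `m`, `df` and `df2` are illustrative only, and the tiny data set is made up for the demonstration:

```r
# c() promotes mixed numbers and strings to character,
# so the matrix built from it is character throughout.
m <- matrix(c(1:4, c("a", "b")), nrow = 2)
typeof(m)  # "character"

# Every column of the resulting data frame is character as well
df <- as.data.frame(m, stringsAsFactors = FALSE)
sapply(df, class)

# The numeric columns have to be converted back explicitly
df[1:2] <- lapply(df[1:2], as.numeric)
sapply(df, class)

# Building the data frame column by column avoids the round trip
df2 <- data.frame(x = 1:2, y = 3:4, g = c("a", "b"))
sapply(df2, class)
```

Building the columns directly with data.frame() is one way to avoid the character round trip; whether that has any bearing on the step() error is not established here.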
