4

I have a question regarding the extraction of the (raw) model matrix of random effects from models fitted with lmer (lme4) in R. More specifically, I want to obtain a data frame or a matrix that contains all variables that are involved in random effects terms. The matter is further complicated because some entries of that matrix are zero.

I usually extracted these matrices by accessing the sparse model matrix (Zt) via getME, after which I converted it to a regular matrix via its dimensions (see below). However, this leads to problems whenever the (raw) model matrix contains zeros because Zt only contains the nonzero elements.

Here is an example, a simple mixed effects model where x1 is normal and x2 contains five values that are exactly zero:

id <- rep(1:20,each=5)
y <- rnorm(100)
x1 <- rnorm(100)
x2 <- c(rep(0,5),rnorm(95))
df <- data.frame(id,x1,x2,y)

I fit two models using lmer, one with x1 the other with x2 as a predictor:

library(lme4)    
m1 <- lmer(y~1+x1+(1+x1|id), data=df)
m2 <- lmer(y~1+x2+(1+x2|id), data=df)

Here, I access the Zt slot of the fitted model object. The code below demonstrates that Zt doesn't contain the zero values in x2. As a result, my very simple conversion into a regular matrix throws an error.

# length okay
length(getME(m1,"Zt")@x)
# model matrix okay
mm1 <- matrix(getME(m1,"Zt")@x, ncol=2, byrow=T)

# too short
length(getME(m2,"Zt")@x)
# gives error on model matrix
mm2 <- matrix(getME(m2,"Zt")@x, ncol=2, byrow=T)

Here is what I thought I can do instead. lmer seems to save the raw matrices as well, which appears to work well as long as there is only one cluster variable.

# seems to work okay
mm3 <- getME(m2,"mmList")[[1]]

However, the mmList slot is poorly documented online, and I barely find mentioning that people use it for programming. Accessing Zt seems by far the more common option.

Is it possible to construct the model matrix of random effects from Zt even if the raw model matrix contains zeros? If not, then what should I expect from mmList?

SimonG
  • 4,701
  • 3
  • 20
  • 31
  • Can you specify **why** you want the full matrix instead of the sparse matrix? It seems the `Matrix` package (ie-sparse matrix) is more useful as it generalizes more easily... I've run into problems / memory issues trying to avoid `Matrix` in the past when working with large datasets – alexwhitworth Jul 08 '15 at 15:41
  • As I wrote in my question, I need the design matrix for the random effects in a given model, that is, the `(N*q)` model matrix containing the predictor variables associated with random effects. The `Zt` slot doesn't give exactly that, making it necessary to transform `Zt`, which is hindered by the handling of structural zeros (see Ben's answer below). The matrix is broken up into smaller matrices afterwards, which is why I don't run into memory problems, even with "large" data sets, as far as that goes in my field. – SimonG Jul 08 '15 at 18:19
  • I understand that you "need" the design matrix. I asked why that is the case (not whether that is the case). You've simple restated your need for it here. My comment was to suggest that perhaps the sparse matrix is sufficient if you learn to use the sparse matrices. Perhaps it isn't. If you don't wish to expand on your request, that's perfectly fine. You seem to have gotten a workable solution from Dr. Bolker – alexwhitworth Jul 20 '15 at 22:21
  • Using the sparse matrix may be more computationally efficient, and I may try that in the future. However, as I explained in my last comment, memory problems are unlikely to happen in my case. Therefore, Ben's suggestion on `mmList` perfectly answers my question. – SimonG Jul 20 '15 at 23:52

1 Answers1

3

If mmList is there, then it's not going away (however poorly documented it may be -- feel free to suggest documentation improvements ...). How about

do.call(cbind,getME(m2,"mmList"))

(which would seem to generalize correctly for multi-term models)?

I agree that it's a bit of a pain that Zt doesn't distinguish correctly between structural and non-structural zeroes -- it might be possible to change the underlying code to make this work if it were sufficiently important, but I think it would be hard enough that we'd need a pretty compelling use case ...

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • Hi Ben! This is a very elegant solution, and I am happy to learn that `mmList` will scale to larger problems. I cannot really suggest an improvement for the documentation, but I asked this question online so people can find a reference for using `mmList` instead of `Zt`. – SimonG Jun 19 '15 at 09:34