Your data are unbalanced in a way that makes the fixed-effect model rank-deficient (or multicollinear, if you prefer). When you include X5/X6
you are stating that you want to estimate effects for all combinations of X5
and X6
. However:
with(dd, table(VarX6,VarX5))
VarX5
VarX6 A B H IND Q S T
0 2 9 94 155 0 1 15
1 0 0 0 0 8 0 0
Only VarX5=Q
is ever measured at the VarX6=1
level, and it's never measured at the VarX6=0
level. This means the VarX6
variable, and its interaction with VarX5
, are redundant information.
As pointed out in the comments, if you run this in lme4::lmer()
it will automatically drop the redundant columns for you, with a message:
library(lme4)
m2 <- lmer(log(Resp)~log(VarX1)+log(VarX2)+(VarX3)+(VarX4)+
VarX5/VarX6 + (1|VarCat),
dd, REML=TRUE)
fixed-effect model matrix is rank deficient so dropping 7 columns / coefficients
You can find out which columns it dropped via attr(getME(m2,"X"), "col.dropped")
.
Alternatively, if you fit it in lm()
(I know you want to fit a mixed model, but this is a good diagnostic) you'll see that it doesn't complain, but it automatically sets all of the redundant coefficients to NA
:
m3 <- lm(log(Resp)~log(VarX1)+log(VarX2)+(VarX3)+(VarX4)+
VarX5/VarX6, data=dd)
coef(m3)
(Intercept) log(VarX1) log(VarX2) VarX3 VarX4
0.46921538 0.79476848 -0.45769296 1.85386835 -2.78321092
VarX5B VarX5H VarX5IND VarX5Q VarX5S
-0.04677216 0.21896140 0.24584351 -2.00226719 0.32677006
VarX5T VarX5A:VarX61 VarX5B:VarX61 VarX5H:VarX61 VarX5IND:VarX61
0.17474369 NA NA NA NA
VarX5Q:VarX61 VarX5S:VarX61 VarX5T:VarX61
NA NA NA
This question is very similar to Singularity in backsolve at level 0, block 1 in LME model . When you have unbalanced designs like this, "what to do about it" is not a question with a single simple answer.
- you could remove terms from the model yourself (e.g. in this case you can't really estimate anything about
VarX6
, since it is completely redundant with VarX5
, so replace VarX5/VarX6
in your model with VarX5
.
- you could use a function such as
lmer
that can automatically remove terms for you
What you can't do is actually estimate VarX5/VarX6
- your design just doesn't include that information. It's a little like saying "I want to estimate the effect of car colour on speed, but I only measured red cars".