My data frame "MyDataRisk" has 36692 rows, with one binary response variable (Risk), three continuous predictors (Road_width, Vegetation_height, Vegetation_Distance) and one categorical predictor (Landscape). The variable id identifies the study site. Here is a summary:
> summary(MyDataRisk)
       id                 Landscape       Road_width         Risk
 Min.   :  1.00   Forest       : 7214   Min.   :3.800   Min.   :0.0000
 1st Qu.: 11.00   Double hedge : 4955   1st Qu.:5.500   1st Qu.:0.0000
 Median : 31.00   Simple hedge : 3490   Median :6.000   Median :0.0000
 Mean   : 40.92   Perp_Hedge   :15433   Mean   :6.005   Mean   :0.1875
 3rd Qu.: 66.00   Edge         : 4020   3rd Qu.:6.400   3rd Qu.:0.0000
 Max.   :112.00   No_vegetation: 1580   Max.   :7.700   Max.   :1.0000
 Vegetation_height    Vegetation_Distance
 Min.   :-2.17260     Min.   :-1.32359
 1st Qu.:-0.54750     1st Qu.:-0.82262
 Median :-0.08318     Median : 0.04941
 Mean   : 0.00000     Mean   : 0.00000
 3rd Qu.: 0.61329     3rd Qu.: 1.04935
 Max.   : 2.70271     Max.   : 1.74702
I used glmmPQL (from the MASS package) to model my response variable:
library(MASS)  # provides glmmPQL
Mod1 <- glmmPQL(Risk ~ Vegetation_height + Road_width + Vegetation_Distance + Landscape
                + Vegetation_height:Road_width
                + Vegetation_height:Vegetation_Distance
                + Vegetation_height:Landscape
                + Road_width:Vegetation_Distance
                + Road_width:Landscape
                + Vegetation_Distance:Landscape,
                data = MyDataRisk,
                family = binomial, random = ~ 1 | id)
And I obtain the following error message:
iteration 1
Error in MEEM(object, conLin, control$niterEM) :
Singularity in backsolve at level 0, block 1
The level "No_vegetation" of the categorical variable "Landscape" has no variability in "Vegetation_height" and "Vegetation_Distance", which makes sense. I suspect this is the source of the problem: if I remove the "No_vegetation" observations, the model fits fine. However, I would like to keep these observations because they are important in my design, and test the interactions for every level of "Landscape" except "No_vegetation".
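To illustrate the suspected cause, here is a minimal sketch on made-up toy data (not MyDataRisk; the values, the three-level factor and the reduced formula are assumptions for illustration only). It checks the within-level variability of a covariate and compares the rank of the fixed-effects design matrix with its number of columns; a rank lower than the column count would be consistent with the singularity reported above.

```r
# Toy data standing in for MyDataRisk (hypothetical values, not the real data)
set.seed(1)
n <- 600
toy <- data.frame(
  Landscape = factor(rep(c("Forest", "Edge", "No_vegetation"), each = n / 3)),
  Vegetation_height = rnorm(n),
  Vegetation_Distance = rnorm(n)
)

# Mimic the design: vegetation covariates are constant where there is no vegetation
no_veg <- toy$Landscape == "No_vegetation"
toy$Vegetation_height[no_veg] <- -2
toy$Vegetation_Distance[no_veg] <- -1

# Within-level standard deviation: a value of 0 flags a level with no variability
sd_by_level <- tapply(toy$Vegetation_height, toy$Landscape, sd)
print(sd_by_level)

# Fixed-effects design matrix for a covariate-by-factor interaction
X <- model.matrix(~ Vegetation_height * Landscape, data = toy)

# Numerical rank versus number of columns: a smaller rank means aliased columns
rank_X <- qr(X)$rank
cat("columns:", ncol(X), " rank:", rank_X, "\n")
```

In this toy setup the interaction column for "No_vegetation" is just a multiple of the "No_vegetation" dummy column, so the matrix loses one rank. The same check could be run on the real data by swapping in the full model formula.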
I know that it is possible to modify the design matrix of a model, but I could not work out whether that would work in my case, or how to do it.
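For concreteness, this is the kind of design-matrix modification I have in mind, sketched on made-up toy data (not MyDataRisk). The names toy, X_red and design are hypothetical, and I have not verified that this resolves the glmmPQL error. The idea is to build the full fixed-effects matrix with model.matrix, use a pivoted QR decomposition to find a linearly independent set of columns, and keep only those as ordinary numeric predictors.

```r
# Toy data standing in for MyDataRisk (hypothetical values, not the real data)
set.seed(2)
n <- 300
toy <- data.frame(
  Landscape = factor(rep(c("Forest", "Edge", "No_vegetation"), each = n / 3)),
  Vegetation_height = rnorm(n)
)
# The covariate is constant within the "No_vegetation" level
toy$Vegetation_height[toy$Landscape == "No_vegetation"] <- -2

# Full fixed-effects design matrix, including the aliased interaction column
X <- model.matrix(~ Vegetation_height * Landscape, data = toy)

# Pivoted QR: the first `rank` pivot entries index linearly independent columns
qrX <- qr(X)
keep <- sort(qrX$pivot[seq_len(qrX$rank)])
dropped <- colnames(X)[-keep]
print(dropped)

# Reduced matrix without the intercept column, with syntactic column names
X_red <- X[, setdiff(keep, 1), drop = FALSE]
colnames(X_red) <- make.names(colnames(X_red))
design <- as.data.frame(X_red)

# Right-hand side of a formula built from the retained columns
fixed_rhs <- reformulate(colnames(design))
print(fixed_rhs)

# Assumption, not run here: with Risk and id added to `design`, the retained
# columns could be passed to a fit such as
#   glmmPQL(update(fixed_rhs, Risk ~ .), random = ~ 1 | id,
#           family = binomial, data = design)
```

Here the only column removed is the height-by-"No_vegetation" interaction, so the interactions for the other levels of "Landscape" are still estimated and the "No_vegetation" observations stay in the data.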