
I have been looking around a little, but I can't find a solution to my problem on here. When I fit a mixed model, the regression output shows only one level for each factor. For example, for "Behandlung" it shows only level "b" and not level "a". The same goes for "Entfernung". What am I doing wrong?

I have the following df (a sample with n = 20):

df <- structure(list(Datum = structure(c(6L, 4L, 1L, 5L, 1L, 5L, 2L, 
1L, 1L, 5L, 2L, 4L, 4L, 4L, 5L, 1L, 4L, 6L, 4L, 5L), .Label = c("2021-03-17", 
"2021-04-07", "2021-04-13", "2021-04-27", "2021-05-11", "2021-05-27"
), class = "factor"), Soll = c("2484", "1202", "1202", "1202", 
"2484", "172", "2484", "552", "1202", "552", "119", "149", "119", 
"149", "149", "1202", "172", "172", "1189", "1202"), Plot = c("1", 
"7", "3", "4", "1", "3", "4", "3", "1", "2", "6", "8", "4", "3", 
"6", "2", "3", "6", "5", "4"), Behandlung = structure(c(2L, 1L, 
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 
1L, 1L), .Label = c("a", "b"), class = "factor"), Entfernung = structure(c(1L, 
1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 
2L, 1L, 2L), .Label = c("2", "5"), class = "factor"), Feuchte = c(16.7, 
19.3, 36.52, 16.52, 31.4, 12.2, 27.65, 35.38, 35.2, 12.4, 26.62, 
13.07, 10.55, 14.97, 9.38, 33.48, 12.95, 21.98, 11.32, 14.52), 
    Korn = c("56.8", "62.9", "65.6", "42.8", "56.8", "44.2", 
    "43.4", "59.9", "60.8", "68.9", "65.5", "51.8", "56.7", "59.9", 
    "62.9", "66.9", "52.3", "61.4", "56.1", "56.1"), DGUnkraut = c(5, 
    25, 1, 0.5, 1, 0.3, 1, 0.1, 1.5, 6, 0.3, 0, 3, 0, 0, 1.5, 
    0.2, 8, 1, 0.5), Ertrag = c(56.8, 62.9, 65.6, 42.8, 56.8, 
    44.2, 43.4, 59.9, 60.8, 68.9, 65.5, 51.8, 56.7, 59.9, 62.9, 
    66.9, 52.3, 61.4, 56.1, 56.1), Datum.y = structure(c(3L, 
    2L, 1L, 3L, 1L, 2L, 1L, 1L, NA, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 
    2L, 3L, 3L, 2L), .Label = c("2021-04-08", "2021-05-17", "2021-07-07"
    ), class = "factor"), DGKultur = c(75L, 65L, 65L, 80L, 38L, 
    60L, 45L, 27L, NA, 76L, 65L, 20L, 55L, 90L, 70L, 73L, 65L, 
    85L, 98L, 70L), DGGes = c(75L, 85L, 65L, 80L, 38L, 60L, 45L, 
    27L, NA, 78L, 65L, 20L, 55L, 90L, 70L, 73L, 65L, 85L, 98L, 
    70L), Hoehe_FA = c(75L, 38L, 17L, 70L, 10L, 40L, 13L, 12L, 
    NA, 42L, 15L, 14L, 44L, 50L, 50L, 46L, 40L, 83L, 87L, 40L
    )), class = "data.frame", row.names = c(NA, -20L))

And this is the code to fit the model:

library(lme4)
model <- lmer(Ertrag ~ Feuchte * DGUnkraut + Entfernung + Datum + Behandlung + (1|Soll/Plot), data = df)

What am I missing? Also, I don't want each date to be shown as its own factor, but rather date as a single factor with each date as a level. Thanks a lot for your help! Cheers

Effigy
  • 1
    Most model functions in R absorb one level of a categorical predictor into the intercept. So here the reported effect "Behandlungb" is for the "b" level of "Behandlung" relative to the "a" level. And you should convert "Datum" to proper Date data using the `as.Date` function. Right now it is being treated as factor data, which is essentially categorical (hence the calculated coefficients for several different dates). – jdobres Feb 14 '22 at 16:10
  • Hey, thanks a lot for your answer! Is there a way to see the estimate and slope for the level that is not shown explicitly? I still don't understand what the slope for level "a" of Behandlung would be. Also, the output looks the same when I convert "Datum" to Date, but I think I want it to be a factor anyway. – Effigy Feb 14 '22 at 16:44
  • 1
    The short answer is no. See here for details: https://stackoverflow.com/questions/30177943/lm-function-in-r-does-not-give-coefficients-for-all-factor-levels-in-categorical – jdobres Feb 14 '22 at 16:49
  • 3
    To me, the short answer is that the coefficient for the `a` level is zero with a standard error of zero. It is fixed at zero for model identification purposes and thus has no sampling variability. – DaveArmstrong Feb 14 '22 at 17:23
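
To make the comments above concrete, here is a minimal sketch of R's default treatment coding. It uses plain `lm()` on a small made-up data frame (`toy` and its values are invented for illustration and are not the asker's data). The fixed effects in an `lmer()` fit are coded the same way, so the reading of "Behandlungb" should carry over, although the numbers in a mixed model will of course differ.

```r
# Toy data, invented for illustration only (not the asker's df).
toy <- data.frame(
  Ertrag     = c(50, 52, 54, 60, 62, 64),
  Behandlung = factor(c("a", "a", "a", "b", "b", "b"))
)

# Default treatment contrasts: the first level ("a") is the reference.
fit <- lm(Ertrag ~ Behandlung, data = toy)
coef(fit)
# (Intercept) is the mean of level "a";
# Behandlungb is the difference mean("b") - mean("a").

# Per-level estimates can be recovered by predicting for each level.
new_levels  <- data.frame(Behandlung = factor(c("a", "b")))
level_means <- predict(fit, newdata = new_levels)

# Making "b" the reference level flips which level is reported.
toy$Behandlung_b_ref <- relevel(toy$Behandlung, ref = "b")
fit_b <- lm(Ertrag ~ Behandlung_b_ref, data = toy)
names(coef(fit_b))
```

So level "a" is not missing: it is the baseline held at zero, and its estimate is the intercept. If a different baseline is easier to interpret, `relevel()` on the factor before fitting is one way to change it.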
