I want to compute predictions for a fixed-effects OLS regression by group with the marginaleffects package. When I generate new data for the prediction and do not specify the levels of the fixed-effects variables that the prediction should use, the prediction function calculates the values at am==0, gear==3 and carb==4 (see below).
Where do these values come from? What's the most sensible way to calculate average predictions in the presence of fixed effects in this example?
library(tidyverse)
library(fixest)
library(marginaleffects)
dat <- mtcars
m1 <- dat %>%
  mutate(cyl = as_factor(cyl)) %>%
  feols(mpg ~ hp*cyl | am + gear + carb)
etable(m1)
#>                               m1
#> Dependent Var.:              mpg
#>
#> hp              -0.1316 (0.0504)
#> cyl6              -47.87 (27.42)
#> cyl8              -21.31 (14.80)
#> hp x cyl6        0.4434 (0.2619)
#> hp x cyl8        0.1471 (0.0665)
#> Fixed-Effects:  ----------------
#> am                           Yes
#> gear                         Yes
#> carb                         Yes
#> _______________ ________________
#> S.E.: Clustered           by: am
#> Observations                  32
#> R2                       0.88892
#> Within R2                0.38565
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
pred1 <- predictions(m1,
  newdata = datagrid(
    hp = min(dat$hp):max(dat$hp),
    cyl = levels(dat$cyl)))
pred1
#>
#> Estimate Std. Error     z Pr(>|z|)   S  2.5 % 97.5 % cyl am gear carb  hp
#>     9.76       8.72 1.119    0.263 1.9  -7.33   26.9   8  0    3    4  52
#>     9.78       8.60 1.136    0.256 2.0  -7.09   26.6   8  0    3    4  53
#>     9.79       8.49 1.154    0.249 2.0  -6.84   26.4   8  0    3    4  54
#>     9.81       8.37 1.172    0.241 2.1  -6.60   26.2   8  0    3    4  55
#>     9.82       8.25 1.190    0.234 2.1  -6.35   26.0   8  0    3    4  56
#> --- 274 rows omitted. See ?avg_predictions and ?print.marginaleffects ---
#>    14.09      23.89 0.590    0.555 0.8 -32.73   60.9   8  0    3    4 331
#>    14.11      24.01 0.588    0.557 0.8 -32.94   61.2   8  0    3    4 332
#>    14.12      24.12 0.586    0.558 0.8 -33.15   61.4   8  0    3    4 333
#>    14.14      24.24 0.583    0.560 0.8 -33.37   61.6   8  0    3    4 334
#>    14.16      24.36 0.581    0.561 0.8 -33.58   61.9   8  0    3    4 335
#> Columns: rowid, estimate, std.error, statistic, p.value, s.value, conf.low, conf.high, mpg, cyl, am, gear, carb, hp
Created on 2023-08-22 with reprex v2.0.2