I'm waiting for the bife
(binary fixed effects) package to be supported in ggeffects
but in the meantime I'd 1st) like to figure out how I can replicate what ggeffects does, and (very 2nd) like to understand whether I should be using the Average Predicted Effect (APEs) or predicted probabilities.
In normal circumstances, I would run a glm
and then plot the variable of interest using ggeffects
.
testglm <- iris %>%
mutate(LengthDummy = if_else(Sepal.Length > 5, "Long", "Short") %>% as_factor(.)) %>%
glm(LengthDummy ~ Sepal.Width + Petal.Length + Petal.Width + Species, data = ., family = binomial)
ggeffects::ggeffect(testglm, terms = "Petal.Length")
You should get this table:
# Predicted probabilities of LengthDummy
Petal.Length | Predicted | 95% CI
---------------------------------------
1.00 | 1.00 | [0.41, 1.00]
1.50 | 1.00 | [0.24, 1.00]
2.50 | 0.58 | [0.04, 0.98]
3.50 | 0.01 | [0.00, 0.09]
4.00 | 0.00 | [0.00, 0.03]
4.50 | 0.00 | [0.00, 0.01]
5.50 | 0.00 | [0.00, 0.00]
7.00 | 0.00 | [0.00, 0.00]
This output, where Petal.Length is a range of (8) values in the dataset, and all the other model predictors are held at their mean, is produced and can be piped into plot()
(ggeffects::ggeffect(testglm, terms = "Petal.Length")%>% plot()
) or into ggplot2()
.
ggeffects::ggeffect(testglm, terms = "Petal.Length") %>%
ggplot(., aes(x, predicted))+
geom_line()
This time, however, my model (not an iris project obviously, this is an example dataset) requires FE. So, my question is, how do I "compute predicted values for all possible levels and values from a logit model’s predictors" as ggeffects
does for bife
glm?
library(bife)
fetest <- iris%>% mutate(LengthDummy= if_else(Sepal.Length>5, "Long", "Short") %>% as_factor(.)) %>%
bife(LengthDummy~ Sepal.Width+ Petal.Length+Petal.Width | Species, data = ., "logit")
I can obtain the Average Predicted Effects which gives me the average of each predictor's effects using APEsIris <- summary(get_APEs(fetest))
. This of course produces a single average for Petal.Length, not 8. I can compute a dataframe with se and the CI, but it is not the equivalent (single estimates)
object <- APEsIris
estimate <- object[["delta"]]
se <- sqrt(diag(object[["vcov"]]))
z <- estimate/se
p <- 2 * pnorm(-abs(z))
conf.low <- estimate-1.96*se
conf.high <- estimate+1.96*se
term <- names(estimate)
APEsdf <- data.frame(term, estimate, se, conf.low, conf.high, z, p)
colnames(cm) <- c("term", "estimate",
"std.error", "conf.low", "conf.high", "statistic", "p.value")
#colnames(cm) <- c("term", "estimate", "Std. error", "Lower bound", "Upper bound", "statistic", "Pr(> |z|)")
rownames(cm) <- NULL
APEsdf <- as.data.frame(APEsdf)
So what about predicting?
predictedbifes<-predict.bife(fetest)
produces 150 predictions, the number of observations, in the dataset. Even if I graph them, I don't understand how I would manipulate the predictions to vary along the observed values of Petal.Length holding all other predictors at the mean.
In short, how can I reproduce ggeffects
calculations to produce a marginal effects table I can use to graph for different variables in the model of this fixed effect logit?