Are you looking for marginal effects or marginal predictions?
As the name implies, the marginpred()
function returns predictions. The argument for predictat
is a data frame with both the control variables and the variables that are in the model. Let me emphasize that: control variables should be left out of the model.
library("survey")
odds2prob <- function(x) x / (x + 1)
prob2odds <- function(x) x / (1 - x)
expit <- function(x) odds2prob(exp(x))
logit <- function(x) log(prob2odds(x))
set.seed(1)
survey_data <- data.frame(
if_female = rbinom(n = 100, size = 1, prob = 0.5),
agegroup = factor(sample(x = 1:3, size = 100, replace = TRUE)),
education = NA_integer_,
if_member = NA_integer_)
survey_data["agegroup"] <- relevel(survey_data$agegroup, ref = 3)
# Different probabilities between female and male persons
survey_data[survey_data$if_female == 0, "education"] <- sample(
x = 1:4,
size = sum(survey_data$if_female == 0),
replace = TRUE,
prob = c(0.1, 0.1, 0.5, 0.3))
survey_data[survey_data$if_female == 1, "education"] <-sample(
x = 1:4,
size = sum(survey_data$if_female == 1),
replace = TRUE,
prob = c(0.1, 0.1, 0.3, 0.5))
survey_data["if_member"] <- rbinom(n = 100, size = 1, prob =
expit((survey_data$education - 3)/2))
survey_data["education"] <- factor(survey_data$education)
survey_data["education"] <- relevel(survey_data$education, ref = 3)
survey_design <- svydesign(ids = ~ 1, data = survey_data)
logit <- svyglm(if_member ~ if_female + agegroup + education,
family = quasibinomial(link = "logit"),
design = survey_design)
exp(cbind(`odds ratio` = coef(logit), confint(logit)))
newdf <- data.frame(if_female = 0:1, education = c(3, 3), agegroup = = c(3, 3))
# Fails
mp <- marginpred(model = logit, adjustfor = ~ agegroup + education,
predictat = newdf, se = TRUE, type = "response")
logit2 <- svyglm(if_member ~ if_female,
family = quasibinomial(link = "logit"),
design = survey_design)
mp <- marginpred(model = logit2, adjustfor = ~ agegroup + education,
predictat = newdf, se = TRUE, type = "response")
# Probability for male and for female persons controlling for agegroup and education
cbind(prob = mp, confint(mp))
That's how I estimate marginal effects with the survey
package:
# Probability difference between female and male persons
# when agegroup and education are set to 3
svycontrast(full_model, quote(
(exp(`(Intercept)` + if_female) / (exp(`(Intercept)` + if_female) + 1)) -
(exp(`(Intercept)`) / (exp(`(Intercept)`) + 1))))
# Can't use custom functions like expit :_(
There are probably smarter ways, but I hope it helps.
Please note that the difference between the probabilities predicted by marginpred()
is different from the difference estimated by svycontrast()
. The probabilities predicted by marginpred()
don't seem to be affected by changing the value of the control variables (in example,
education = c(4, 4)
instead of education = c(3, 3)
), but the estimates from svycontrast()
are affected as implied by the regression model.