4

I´m trying to estimate marginal effect of a logit model in which I have several dichotomous explanatory variables.

Let's say the model estimated by

logit<- svyglm ( if_member ~ if_female + dummy_agegroup_2 + dummy_agegroup_3 + dummy_education_2 + dummy_education_3 + dummy_education_4, family = quasibinomial(link = "logit"), design = survey_design)

I know about the marginpred function in survey package, but I am not very familiar with it. I have only dichotomous variebles in the model so I am wondering how to estimate marginal effects by this function, especially I am not sure about the predictat (A data frame giving values of the variables in model to predict at).

jay.sf
  • 60,139
  • 8
  • 53
  • 110
david_sst
  • 55
  • 6
  • http://stackoverflow.com/questions/26468360/how-can-i-generate-marginal-effects-for-a-logit-model-when-using-survey-weights – Carl Jul 21 '16 at 03:24
  • @Carl Yeah I saw this link before I had asked my question. However, there is not the answer on my question. I have tried to modify logitmfx function in accordance with the link you provided but without success. – david_sst Jul 21 '16 at 06:13
  • not sure if it's of any value, but the predicted marginals section of this how-to might help? http://www.asdfree.com/2015/11/statistically-significant-trends-with.html – Anthony Damico Jul 22 '16 at 15:22
  • This might be a better fit on http://stats.stackexchange.com/ – Stedy Jul 22 '16 at 22:57

1 Answers1

2

Are you looking for marginal effects or marginal predictions?

As the name implies, the marginpred() function returns predictions. The argument for predictat is a data frame with both the control variables and the variables that are in the model. Let me emphasize that: control variables should be left out of the model.

library("survey")

odds2prob <- function(x) x / (x + 1)
prob2odds <- function(x) x / (1 - x)
expit <- function(x) odds2prob(exp(x))
logit <- function(x) log(prob2odds(x))

set.seed(1)

survey_data <- data.frame(
  if_female = rbinom(n = 100, size = 1, prob = 0.5), 
  agegroup = factor(sample(x = 1:3, size = 100, replace = TRUE)), 
  education = NA_integer_,
  if_member = NA_integer_)
survey_data["agegroup"] <- relevel(survey_data$agegroup, ref = 3)
# Different probabilities between female and male persons
survey_data[survey_data$if_female == 0, "education"] <- sample(
  x = 1:4, 
  size = sum(survey_data$if_female == 0), 
  replace = TRUE, 
  prob = c(0.1, 0.1, 0.5, 0.3))
survey_data[survey_data$if_female == 1, "education"] <-sample(
  x = 1:4, 
  size = sum(survey_data$if_female == 1), 
  replace = TRUE, 
  prob = c(0.1, 0.1, 0.3, 0.5))
survey_data["if_member"] <- rbinom(n = 100, size = 1, prob = 
                                     expit((survey_data$education - 3)/2))
survey_data["education"] <- factor(survey_data$education)
survey_data["education"] <- relevel(survey_data$education, ref = 3)
survey_design <- svydesign(ids = ~ 1, data = survey_data)

logit <- svyglm(if_member ~ if_female + agegroup + education, 
                family = quasibinomial(link = "logit"), 
                design = survey_design)
exp(cbind(`odds ratio` = coef(logit), confint(logit)))
newdf <- data.frame(if_female = 0:1, education = c(3, 3), agegroup =  = c(3, 3))
# Fails
mp <- marginpred(model = logit, adjustfor = ~ agegroup + education, 
                 predictat = newdf, se = TRUE, type = "response")
logit2 <- svyglm(if_member ~ if_female, 
                family = quasibinomial(link = "logit"), 
                design = survey_design)
mp <- marginpred(model = logit2, adjustfor = ~ agegroup + education, 
                 predictat = newdf, se = TRUE, type = "response")
# Probability for male and for female persons controlling for agegroup and education
cbind(prob = mp, confint(mp))

That's how I estimate marginal effects with the survey package:

# Probability difference between female and male persons
# when agegroup and education are set to 3
svycontrast(full_model, quote(
  (exp(`(Intercept)` + if_female) / (exp(`(Intercept)` + if_female) + 1)) - 
  (exp(`(Intercept)`) / (exp(`(Intercept)`) + 1))))
# Can't use custom functions like expit :_(

There are probably smarter ways, but I hope it helps.

Please note that the difference between the probabilities predicted by marginpred() is different from the difference estimated by svycontrast(). The probabilities predicted by marginpred() don't seem to be affected by changing the value of the control variables (in example, education = c(4, 4) instead of education = c(3, 3)), but the estimates from svycontrast() are affected as implied by the regression model.