I wish to evaluate marginal effects of variables in a logit regression using a dataset like this (with 40k observations):
d1<- structure(list(dummy.eleito = c(1, 0, 0, 0, 0, 1, 1, 1, 1, 0),
dummy.tratamento = c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0),
Escolaridade = c("SUPERIOR_INCOMPLETO", "FUNDAMENTAL_INCOMPLETO",
"SUPERIOR_COMPLETO", "FUNDAMENTAL_INCOMPLETO",
"SUPERIOR_COMPLETO", "SUPERIOR_COMPLETO", "SUPERIOR_INCOMPLETO",
"SUPERIOR_INCOMPLETO", "SUPERIOR_COMPLETO", "SUPERIOR_INCOMPLETO"),
Raca = c("Preta_Parda", "Preta_Parda", "Preta_Parda", "Preta_Parda",
"Preta_Parda", "Preta_Parda", "BRANCA", "BRANCA", "BRANCA", "BRANCA"),
DESCRICAO_SEXO = c("MASCULINO", "MASCULINO", "MASCULINO",
"MASCULINO", "MASCULINO", "MASCULINO", "MASCULINO",
"MASCULINO", "MASCULINO", "MASCULINO"),
votos.cidade = c(6483, 6483, 6483, 6483, 6483, 6483, 4735,
4735, 4735, 4735),
dummy.prefeito = c(0,1, 0, 0, 0, 1, 0, 0, 0, 1),
Intensidade.Trat0.Mun = c(0.0152671755725191, 0.0152671755725191, 0.0152671755725191, 0.0152671751,
0.0152671755725191, 0.01526717, 0.02857142856, 0.028571428, 0.028571, 0.0285714),
Var.Receitas = c(3.25607407, 11.424, 4.5549, -0.832116880227985, 5.78901737320675, -0.02459246,
1.151009, -0.3058719238, 0.742947247, -0.2711)),
.Names = c("dummy.eleito", "dummy.tratamento", "Escolaridade", "Raca",
"DESCRICAO_SEXO", "votos.cidade", "dummy.prefeito", "Intensidade.Trat0.Mun",
"Var.Receitas"), row.names = c(NA, 10L), class = "data.frame")
I run the following regression using glm:
model <- glm(dummy.eleito ~ dummy.tratamento + factor(Escolaridade) +
factor(Raca) + factor(DESCRICAO_SEXO) +
votos.cidade + dummy.prefeito +
dummy.tratamento:Intensidade.Trat0.Mun +
Var.Receitas + Var.Receitas:dummy.tratamento,
data = d1,
family = binomial(link = 'logit'))
Then I evaluate marginal effects at some points:
library(margins)
m <- margins(model, at = list(dummy.tratamento = 1,
Intensidade.Trat0.Mun = fivenum(d1$Intensidade.Trat0.Mun),
Var.Receitas = fivenum(d1$Var.Receitas)))
R ran this through the whole night, and by morning it still had not finished. Is that normal? What could be the reason? Is the data too complex, or is it the regression formula itself? Even when I ran margins without the at argument, it did not finish.
Any help?
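To make concrete what has to be computed for each variable and each at combination, here is a minimal base-R sketch of an average marginal effect obtained by finite differences. It is only an illustration under stated assumptions: the data are simulated, the names sim, treat, revenue and elected are hypothetical stand-ins for my real variables, and this is not necessarily how margins does it internally.

```r
# Simulated stand-in data; all names and coefficients are hypothetical.
set.seed(42)
n <- 1000
sim <- data.frame(treat = rbinom(n, 1, 0.3),
                  revenue = rnorm(n, 2, 3))
sim$elected <- rbinom(n, 1, plogis(-0.5 + 0.3 * sim$treat + 0.2 * sim$revenue))

fit <- glm(elected ~ treat + revenue + treat:revenue,
           data = sim, family = binomial(link = "logit"))

# Fix treat at 1 for every observation, as at = list(treat = 1) would.
at1 <- transform(sim, treat = 1)

# Central finite difference of the predicted probability w.r.t. revenue.
h <- 1e-6
p_up   <- predict(fit, newdata = transform(at1, revenue = revenue + h),
                  type = "response")
p_down <- predict(fit, newdata = transform(at1, revenue = revenue - h),
                  type = "response")
ame_numeric <- mean((p_up - p_down) / (2 * h))

# Analytic check for a logit: dp/dx = p * (1 - p) * (b_revenue + b_interaction * treat)
p <- predict(fit, newdata = at1, type = "response")
b <- coef(fit)
ame_analytic <- unname(mean(p * (1 - p) * (b["revenue"] + b["treat:revenue"])))

c(numeric = ame_numeric, analytic = ame_analytic)
```

Each at combination repeats this kind of pass over all rows, so 5 x 5 combinations on 40k observations means many full predictions, before any standard errors are computed.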
EDIT:
After updating R to its newest version, this is what I got in the end:
Running the regressions I needed and the margins command on the entire dataset took a long time, but it did finish.
However, the problem persisted when using the at argument inside margins. I suspect it is because the regression has factor variables. I will probably calculate predicted values of my dependent variable by hand, using the values that I would have passed to at, just to get a grasp of the results.
Any suggested alternatives are welcome.
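For reference, the by-hand calculation I have in mind could look roughly like the sketch below. It uses simulated data so that it runs on its own; the names sim, treat, educ, intensity, revenue and elected are hypothetical stand-ins for my real variables, and holding the factor at level "A" is an arbitrary choice made only for illustration.

```r
# Simulated stand-in data; all names and values here are hypothetical.
set.seed(1)
n <- 500
sim <- data.frame(
  treat     = rbinom(n, 1, 0.3),
  educ      = sample(c("A", "B", "C"), n, replace = TRUE),
  intensity = runif(n, 0, 0.05),
  revenue   = rnorm(n, 2, 3),
  stringsAsFactors = FALSE
)
sim$elected <- rbinom(n, 1, plogis(-0.5 + 0.4 * sim$treat + 0.1 * sim$revenue))

# Same structure as my model: a factor, plus interactions with the treatment dummy.
fit <- glm(elected ~ treat + factor(educ) + treat:intensity +
             revenue + treat:revenue,
           data = sim, family = binomial(link = "logit"))

# Grid of the values I would otherwise pass to at = list(...).
# The factor is held at one level; the other variables take fivenum() values.
grid <- expand.grid(
  treat     = 1,
  educ      = "A",
  intensity = fivenum(sim$intensity),
  revenue   = fivenum(sim$revenue),
  stringsAsFactors = FALSE
)

# Predicted probability of elected = 1 at each grid point.
grid$p_hat <- predict(fit, newdata = grid, type = "response")
head(grid)
```

This gives predicted probabilities at the chosen points rather than marginal effects, but differences between neighbouring grid rows already show how the prediction moves with each variable.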