I am using R to replicate a study and obtain mostly the same results the author reported. At one point, however, I calculate marginal effects that seem to be unrealistically small. I would greatly appreciate if you could have a look at my reasoning and the code below and see if I am mistaken at one point or another.
My sample contains 24535 observations, the dependent variable "x028bin" is a binary variable taking on the values 0 and 1, and there are furthermore 10 explaining variables. Nine of those independent variables have numeric levels, the independent variable "f025grouped" is a factor consisting of different religious denominations.
I would like to run a probit regression including dummies for religious denomination and then compute marginal effects. In order to do so, I first eliminate missing values and use cross-tabs between the dependent and independent variables to verify that there are no small or 0 cells. Then I run the probit model which works fine and I also obtain reasonable results:
probit4AKIE <- glm(x028bin ~ x003 + x003squ + x025secv2 + x025terv2 + x007bin + x04chief + x011rec + a009bin + x045mod + c001bin + f025grouped, family=binomial(link="probit"), data=wvshm5red2delna, na.action=na.pass)
summary(probit4AKIE)
However, when calculating marginal effects with all variables at their means from the probit coefficients and a scale factor, the marginal effects I obtain are much too small (e.g. 2.6042e-78). The code looks like this:
ttt <- cbind(wvshm5red2delna$x003,
wvshm5red2delna$x003squ,
wvshm5red2delna$x025secv2,
wvshm5red2delna$x025terv2,
wvshm5red2delna$x007bin,
wvshm5red2delna$x04chief,
wvshm5red2delna$x011rec,
wvshm5red2delna$a009bin,
wvshm5red2delna$x045mod,
wvshm5red2delna$c001bin,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped,
wvshm5red2delna$f025grouped) #I put variable "f025grouped" 9 times because this variable consists of 9 levels
ttt <- as.data.frame(ttt)
xbar <- as.matrix(mean(cbind(1,ttt[1:19]))) #1:19 position of variables in dataframe ttt
betaprobit4AKIE <- probit4AKIE$coefficients
zxbar <- t(xbar) %*% betaprobit4AKIE
scalefactor <- dnorm(zxbar)
marginprobit4AKIE <- scalefactor * betaprobit4AKIE[2:20] #2:20 are the positions of variables in the output of the probit model 'probit4AKIE' (variables need to be in the same ordering as in data.frame ttt), the constant in the model occupies the first position
marginprobit4AKIE #in this step I obtain values that are much too small
I apologize that I can not provide you with a working example as my dataset is much too large. Any comment would be greatly appreciated. Thanks a lot.
Best,
Tobias