I want to fit a count data regression model in which the number of doctor visits is the dependent variable. I first estimated a two-part model: a probit model for whether there was any doctor visit (none versus one or more), followed by a Poisson model for respondents with at least one doctor visit. As a robustness check I then estimated a hurdle model, because as far as I know the two approaches should give very similar estimates. The estimates for the probit part are nearly identical, but those for the Poisson part are very different. Does anyone have an idea why? Here are the commands I used:
# Part 1 of the two-part model: probit for any doctor visit
probit_doc <- glm(docbin ~ phi + gender + age + health + educ + smoke + logthinc +
                    wave + AUS + GER + SWE + NED + ESP + ITA + FRA + DEN + GRE +
                    SWI + BEL + ISR + CZE + POL + LUX + HUN + POR + SVN + EST +
                    CRO + LIT + BUL + CYP + FIN + LVA + MAL + ROM,
                  data = allwaves, family = binomial(link = "probit"))
# Part 2 of the two-part model: Poisson GLM
poisson_doc <- glm(I(doc > 0) ~ phi + gender + age + health + educ + smoke +
                     logthinc + wave + AUS + GER + SWE + NED + ESP + ITA + FRA +
                     DEN + GRE + SWI + BEL + ISR + CZE + POL + LUX + HUN + POR +
                     SVN + EST + CRO + LIT + BUL + CYP + FIN + LVA + MAL + ROM,
                   data = allwaves, family = "poisson")
# Hurdle model: count part (Poisson) | zero part (probit)
library(pscl)  # assumed source of hurdle()
hd_doc <- hurdle(doc ~ phi + gender + age + educ + smoke + logthinc + wave +
                   AUS + GER + SWE + NED + ESP + ITA + FRA + DEN + GRE + SWI +
                   BEL + ISR + CZE + POL + LUX + HUN + POR + SVN + EST + CRO +
                   LIT + BUL + CYP + FIN + LVA + MAL + ROM | phi + gender +
                   age + health + educ + smoke + logthinc + wave + AUS + GER +
                   SWE + NED + ESP + ITA + FRA + DEN + GRE + SWI + BEL + ISR +
                   CZE + POL + LUX + HUN + POR + SVN + EST + CRO + LIT + BUL +
                   CYP + FIN + LVA + MAL + ROM,
                 dist = "poisson", data = allwaves, zero.dist = "binomial", link = "probit")
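For reference, this is the hurdle specification as I understand it, written out as a sketch. The symbols are my own notation and do not come from the output: y_i is the number of doctor visits, z_i and x_i are the regressors of the zero and count parts, \gamma and \beta are the corresponding coefficients, and \Phi is the standard normal CDF.

```latex
% Zero part (probit): probability of crossing the hurdle
P(y_i > 0 \mid z_i) = \Phi(z_i^\top \gamma)

% Count part: Poisson truncated at zero, defined for k = 1, 2, \dots
P(y_i = k \mid y_i > 0, x_i)
  = \frac{e^{-\lambda_i} \lambda_i^{k}}{k! \, \bigl(1 - e^{-\lambda_i}\bigr)},
\qquad \lambda_i = \exp(x_i^\top \beta)
```

Under this specification the count part only uses observations with at least one visit and rescales the Poisson probabilities by 1 - e^{-\lambda_i}. This is the quantity I am trying to reproduce with the second glm() call.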