How to calculate nonlinear (binary) Fixed-Effects Logit for Longitudinal/Panel Data?

Question

I'm trying to estimate child work based on a lagged variable on children's school aspirations.
I'm deciding whether to use glm or clogit to run my models (I need a fixed-effects logit). When I run glm, the coefficients are very different from the ones clogit gives me.

model1 <- glm(chldwork~lag_aspgrade_binned+age+as.factor(childid), data=finaletdtlag, family='binomial')

  GLM Output:
Call:
glm(formula = chldwork ~ lag_aspgrade_binned + age + as.factor(childid), 
    family = "binomial", data = finaletdtlag)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-3.02350   0.00001   0.00002   0.17344   2.13769  

Coefficients:
                                                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)                                     3.037e+01  1.933e+04   0.002   0.9987    
lag_aspgrade_binneddid not complete elementary  2.339e+00  1.083e+00   2.161   0.0307 *  
lag_aspgrade_binnedhs                           1.252e+00  6.082e-01   2.059   0.0395 *  
lag_aspgrade_binnedprimary some hs              1.206e+00  6.739e-01   1.789   0.0735 .  
lag_aspgrade_binnedsome college                 2.081e+00  4.800e-01   4.335 1.46e-05 ***
age                                            -6.123e-01  3.995e-02 -15.326  < 2e-16 ***

Also, when I ran my clogit, I didn't get an intercept in my output (like this example shows: https://data.princeton.edu/wws509/r/fixedRandom3).

My clogit output:

> modela <- clogit(chldwork~lag_aspgrade_binned+age+strata(childid), data=finaletdtlag, method = 'exact')
> summary(modela)
Call:
coxph(formula = Surv(rep(1, 2770L), chldwork) ~ lag_aspgrade_binned + 
    age + strata(childid), data = finaletdtlag, method = "exact")

  n= 2770, number of events= 2358 

                                                   coef exp(coef) se(coef)       z Pr(>|z|)    
lag_aspgrade_binneddid not complete elementary  1.09351   2.98473  0.83332   1.312  0.18944    
lag_aspgrade_binnedhs                           0.53032   1.69948  0.45095   1.176  0.23959    
lag_aspgrade_binnedprimary some hs              0.49815   1.64567  0.50075   0.995  0.31983    
lag_aspgrade_binnedsome college                 1.00269   2.72560  0.34619   2.896  0.00377 ** 
age                                            -0.36846   0.69180  0.02905 -12.684  < 2e-16 ***

Do I have an error in my code? Do y'all prefer one to the other?

I'm wondering why you include age as an independent variable? In my understanding age would be accounted for by the fixed effects since the change in age from t to t+1 is constant across all individuals. In this case you would be over controlling the model. Please someone correct me if I misunderstood. — avocado1, Jul 05 '22 at 10:00

jay.sf · Accepted Answer · 2020-12-25T18:23:28.443

Estimating a nonlinear (binary) fixed-effects (FE) model with the least squares dummy variable (LSDV) approach, as you do with glm, does not account for the incidental parameters problem (IPP), so the estimator is biased. Basically, the IPP occurs when the number of individuals, and thus the number of individual dummies, is large relative to the number of observed time periods. You can find a statistical explanation of the IPP in this answer on Cross Validated.

survival::clogit as well as bife::bife combined with bife::bias_corr take the IPP into account.
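
To get a feel for the size of the bias, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the data are simulated with only two periods per individual and a true slope of 1, and the names n_id, n_t, b_lsdv and b_cond are made up. With two periods the dummy-variable estimate should come out at roughly twice the conditional one.

```r
library(survival)  # clogit(); ships with R as a recommended package

set.seed(1)
n_id <- 500; n_t <- 2; beta <- 1        # assumed design: many individuals, T = 2
d <- data.frame(id = rep(seq_len(n_id), each = n_t),
                x  = rnorm(n_id * n_t))
a <- rnorm(n_id)                         # true individual intercepts a_i
d$y <- rbinom(nrow(d), 1, plogis(a[d$id] + beta * d$x))

# Individuals whose outcome never changes carry no information about beta
# (their dummies diverge), so drop them to keep glm numerically stable.
d <- d[ave(d$y, d$id, FUN = function(v) max(v) - min(v)) > 0, ]

# LSDV logit (one dummy per individual) versus conditional logit
b_lsdv <- coef(glm(y ~ 0 + x + factor(id), data = d, family = binomial))[["x"]]
b_cond <- coef(clogit(y ~ x + strata(id), data = d))[["x"]]

c(lsdv = b_lsdv, conditional = b_cond, ratio = b_lsdv / b_cond)
```

With longer panels the gap shrinks, which is why the difference in the example data below is smaller than a factor of two.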

Let's attempt to calculate a binary FE model using glm (see "data" at the bottom of the answer).

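# LSDV logit: one dummy per individual, no correction for the IPP
s <- summary(g.fit <- glm(y ~ 0 + x1 + x2 + factor(id), data=idc, family='binomial'))$coefficients
# show only the slope coefficients, hiding the individual dummies
s[-grep("factor", rownames(s)), ]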
#      Estimate Std. Error  z value   Pr(>|z|)
# x1 0.80750181 0.32069729 2.517956 0.01180379
# x2 0.08353417 0.05040558 1.657240 0.09747086

Calculating the model with bife::bife initially gives the same result.

summary(b.fit <- bife::bife(y ~ x1 + x2 | id, data=idc))$cm
#      Estimate Std. error  z value  Pr(> |z|)
# x1 0.80750173 0.32070309 2.517911 0.01180533
# x2 0.08353417 0.05040645 1.657212 0.09747662

However, the results are biased upwards and we get smaller estimates when we correct for the IPP:

summary(b.fit.corr <- bife::bias_corr(b.fit))$cm
#      Estimate Std. error  z value  Pr(> |z|)
# x1 0.65672233  0.3192673 2.056967 0.03968939
# x2 0.06672389  0.0498996 1.337163 0.18116946

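survival::clogit, by contrast, already takes the IPP into account, because it conditions the individual intercepts out of the likelihood instead of estimating them.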

summary(survival::clogit(y ~ x1 + x2 + strata(id), data=idc))$coefficients
#         coef exp(coef)  se(coef)        z   Pr(>|z|)
# x1 0.6533630  1.921994 0.2875215 2.272397 0.02306255
# x2 0.0659169  1.068138 0.0449555 1.466270 0.14257474

Note that bife::bife calculates the standard errors more conservatively (they are somewhat larger than those from clogit). Using Stata, however, we get the same result as with survival::clogit:

. use http://www.stata-press.com/data/r13/clogitid, clear

. clogit y x1 x2, group(id) noheader
note: multiple positive outcomes within groups encountered.

Iteration 0:   log likelihood = -123.42828  
Iteration 1:   log likelihood = -123.41386  
Iteration 2:   log likelihood = -123.41386  
------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |    .653363   .2875215     2.27   0.023     .0898312    1.216895
          x2 |   .0659169   .0449555     1.47   0.143    -.0221943    .1540281
------------------------------------------------------------------------------

Hence, to answer your question, the results of your clogit approach should be preferred.
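
When comparing the two fits on your own data, it may also help to check how many children actually change status over time. Individuals whose outcome never varies contribute nothing to the conditional likelihood, and in the glm fit their dummies drift towards plus or minus infinity, which may explain very large standard errors on some dummy coefficients. A small sketch on made-up data (the column names mirror the question, the values are invented):

```r
# Toy panel: 4 children observed 3 times each (values are invented)
panel <- data.frame(
  childid  = rep(1:4, each = 3),
  chldwork = c(1, 1, 1,   0, 1, 1,   0, 0, 0,   1, 0, 1)
)

# TRUE for children whose outcome changes at least once
varies <- tapply(panel$chldwork, panel$childid,
                 function(v) length(unique(v)) > 1)

n_informative <- sum(varies)   # children that inform the slope estimates
n_informative
```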

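The issue with your "missing" intercept is explained as follows. In a pooled regression model we have one overall intercept a, whereas in a fixed-effects model we have many individual intercepts a_i, one per individual. clogit conditions these a_i out of the likelihood rather than estimating them, so its output contains no intercept.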
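
Written out, as a sketch in standard textbook notation (the symbols x_{it}, \beta and \varepsilon_{it} are assumed here, they do not come from the question), the two specifications, the FE logit and its conditional probability for two periods are:

```latex
\begin{align}
\text{pooled:}\quad
  & y_{it} = a + x_{it}'\beta + \varepsilon_{it} \\
\text{fixed effects:}\quad
  & y_{it} = a_i + x_{it}'\beta + \varepsilon_{it} \\
\text{FE logit:}\quad
  & \Pr(y_{it}=1 \mid x_{it}, a_i)
    = \frac{\exp(a_i + x_{it}'\beta)}{1+\exp(a_i + x_{it}'\beta)} \\
\text{conditional } (T=2):\quad
  & \Pr(y_{i1}=0,\, y_{i2}=1 \mid y_{i1}+y_{i2}=1)
    = \frac{\exp(x_{i2}'\beta)}{\exp(x_{i1}'\beta)+\exp(x_{i2}'\beta)}
\end{align}
```

Note that a_i no longer appears in the last line: it cancels from numerator and denominator.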

bife::bife stores the fixed effects in an object named alpha:

b.fit.corr$alpha
#         1014         1017         1019         1023         1025         1027 
# -1.192037100 -0.740369821 -0.093573868 -0.740850707  0.374930145 -1.179830711 
#         1029         1030         1031         1032         1039         1043 
# -0.702326346 -0.460228413 -1.441325728 -1.286722837 -0.117864675 -2.140867994 
#         1044         1045         1047         1050         1055         1060 
# -2.114299987  0.645971508 -0.436457378 -1.165316816 -1.657762052 -0.720114822 
#         1069         1073         1074         1075         1076         1077 
# -1.637936117 -0.782373571 -1.395162657 -1.395167427 -1.637684316 -1.696587849 
#         1078         1092         1094         1095         1097         1098 
# -1.106264431 -0.127039684 -1.439563580 -1.310283872 -0.778302356 -1.982045758 
#         1099         1104         1105         1108         1110         1113 
# -0.005352088 -0.006881257 -1.802152970 -1.165296728 -0.314567361 -1.817725352 
#         1118         1120         1121         1122         1125         1126 
# -1.110847553 -2.128173826 -1.803025930 -1.164773956 -1.107030040 -2.251146286 
#         1128         1133         1137         1141         1142         1144 
# -0.940981858 -1.416409893 -1.441811848 -0.330724832 -1.657610560 -1.136508411 
#         1146         1147         1148         1149         1150         1154 
# -1.286055926  0.196135238 -1.107793309 -1.637240256 -2.226747493 -0.701304388 
#         1155         1156         1157         1163         1169         1172 
# -1.658472805 -1.654763658 -1.134978654 -2.024766764 -1.440093115 -0.940165139 
#         1176         1181         1186         1187         1191         1195 
# -0.481589127 -2.114877897 -1.137394808 -0.006881257 -1.636654053 -0.027152409

They correspond to the individual dummies in the glm model,

g.fit$coefficients[grep("factor", names(g.fit$coefficients))]

however, as explained above, those are biased due to the IPP.

Note: I couldn't find a method to extract the FE with survival::clogit. If someone knows how to do this, please let me know in the comments!
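
One possible workaround, sketched here as an assumption and not as a feature of survival::clogit, is to hold the slope estimates fixed and solve each individual's likelihood equation for a_i, i.e. find the a_i at which the fitted probabilities sum to the observed number of ones. The helper fe_given_beta is hypothetical, the data are simulated, and beta_hat stands in for the coefficients of a fitted model.

```r
# Hypothetical helper (not part of survival): plug-in estimate of one
# individual's intercept a_i, holding the slope estimates fixed.
#   y  : that individual's 0/1 outcomes
#   xb : that individual's linear predictor x %*% beta_hat
fe_given_beta <- function(y, xb) {
  if (all(y == 0) || all(y == 1)) return(NA_real_)   # no finite a_i exists
  score <- function(a) sum(y - plogis(a + xb))        # d logLik / d a_i
  uniroot(score, interval = c(-30, 30), tol = 1e-10)$root
}

# Simulated stand-in data; with a real fit use beta_hat <- coef(fit)
set.seed(42)
n_id <- 50; n_t <- 8
d <- data.frame(id = rep(seq_len(n_id), each = n_t), x = rnorm(n_id * n_t))
d$y <- rbinom(nrow(d), 1, plogis(rnorm(n_id)[d$id] + 0.7 * d$x))
beta_hat <- 0.7                                       # assumed slope estimate
d$xb <- beta_hat * d$x

# One intercept per individual; NA where the outcome never varies
alpha_hat <- sapply(split(d, d$id), function(g) fe_given_beta(g$y, g$xb))
head(alpha_hat)
```

Such plug-in intercepts are noisy in short panels and are not bias-corrected, so they are at best a rough counterpart to what bife::bife stores in alpha.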

Data:

idc <- readstata13::read.dta13("http://www.stata-press.com/data/r11/clogitid.dta")

I read your excellent answer but then you ask for extraction of FE from the object returned from `survival::clogit`. It seems you must mean something other than the coefficients for x1 and x2? Could you not back calculate them from the model result, by subtracting the `coef %*% data[ , c("x1", "x2")]` from linear predictors in the clogit object? — IRTFM, Dec 25 '20 at 17:23
@IRTFM Thanks, your comment made me realize that I had not yet explained the abbreviation FE = fixed effects anywhere. The FE are actually the *a_it* in the FE formula. Your approach would work better if the FE were included as dummies, as is the case in the LSDV method. With FE, however, this is not the case, since they are calculated differently. Furthermore, those binary FE models are calculated with maximum likelihood. `bife::bife` stores the FE, whereas `survival::clogit` apparently doesn't, which I think is the problem. — jay.sf, Dec 25 '20 at 18:22
When I tried my method every single case had the same result, probably some sort of grand mean `(Intercept)`. — IRTFM, Dec 26 '20 at 02:17
