
Can I specify a Random and a Fixed Effects model on Panel Data using lme4?

I am redoing Example 14.4 from Wooldridge (2013, p. 494-5) in R. Thanks to this site and this blog post I've managed to do it in the plm package, but I'm curious whether I can do the same in the lme4 package.

Here's what I've done in the plm package. I would be grateful for any pointers as to how I can do the same using lme4. First, the packages needed and the loading of the data,

# install.packages(c("wooldridge", "plm", "stargazer"), dependencies = TRUE)
library(wooldridge) 
data(wagepan)

Second, I estimate the three models from Example 14.4 (Wooldridge 2013) using the plm package,

library(plm) 
Pooled.ols <- plm(lwage ~ educ + black + hisp + exper+I(exper^2)+ married + union +
                  factor(year), data = wagepan, index=c("nr","year") , model="pooling")

random.effects <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married + union +
                      factor(year), data = wagepan, index = c("nr","year") , model = "random") 

fixed.effects <- plm(lwage ~ I(exper^2) + married + union + factor(year), 
                     data = wagepan, index = c("nr","year"), model="within")

Third, I output the results using stargazer to emulate Table 14.2 in Wooldridge (2013),

stargazer::stargazer(Pooled.ols,random.effects,fixed.effects, type="text",
           column.labels=c("OLS (pooled)","Random Effects","Fixed Effects"), 
          dep.var.labels = c("log(wage)"), keep.stat=c("n"),
          keep=c("edu","bla","his","exp","marr","union"), align = TRUE, digits = 4)
#> ======================================================
#>                         Dependent variable:           
#>              -----------------------------------------
#>                              log(wage)                
#>              OLS (pooled) Random Effects Fixed Effects
#>                  (1)           (2)            (3)     
#> ------------------------------------------------------
#> educ          0.0913***     0.0919***                 
#>                (0.0052)      (0.0107)                 
#>                                                       
#> black         -0.1392***    -0.1394***                
#>                (0.0236)      (0.0477)                 
#>                                                       
#> hisp            0.0160        0.0217                  
#>                (0.0208)      (0.0426)                 
#>                                                       
#> exper         0.0672***     0.1058***                 
#>                (0.0137)      (0.0154)                 
#>                                                       
#> I(exper2)     -0.0024***    -0.0047***    -0.0052***  
#>                (0.0008)      (0.0007)      (0.0007)   
#>                                                       
#> married       0.1083***     0.0640***      0.0467**   
#>                (0.0157)      (0.0168)      (0.0183)   
#>                                                       
#> union         0.1825***     0.1061***      0.0800***  
#>                (0.0172)      (0.0179)      (0.0193)   
#>                                                       
#> ------------------------------------------------------
#> Observations    4,360         4,360          4,360    
#> ======================================================
#> Note:                      *p<0.1; **p<0.05; ***p<0.01

Is there an equally simple way to do this in lme4? Should I stick to plm? Why/why not?

Eric Fail
  • Wouldn't this be more suited for [stats.se]? – Jaap Feb 28 '18 at 15:27
  • @Jaap, thank you for your comment. I see it as mainly a programming question, and not really a statistical/Cross Validated question. But I'm happy to move it if you think it belongs on CV. – Eric Fail Feb 28 '18 at 16:09
  • Please note that `lme4` works within the maximum likelihood framework, so it won't be the "same": ch. 7 of plm's vignette has some comparison to pkg `nlme`, which is similar to `lme4`, and you should be able to take it from there. – Helix123 Feb 28 '18 at 22:25
  • @Helix123, thank you for your comment. I will look into that. – Eric Fail Mar 01 '18 at 07:07

1 Answer

Except for the difference in estimation method, it does indeed seem to be mainly a question of vocabulary and syntax.

# install.packages(c("wooldridge", "plm", "stargazer", "lme4"), dependencies = TRUE)
library(wooldridge) 
library(plm) 
#> Loading required package: Formula
library(lme4)
#> Loading required package: Matrix
data(wagepan)

Your first example is a simple linear model ignoring the groups nr.
You can't do that with lme4 because there is no "random effect" (in the lme4 sense).
This is what Gelman & Hill call a complete pooling approach.

Pooled.ols <- plm(lwage ~ educ + black + hisp + exper+I(exper^2)+ married + 
                      union + factor(year), data = wagepan, 
                  index=c("nr","year"), model="pooling")

Pooled.ols.lm <- lm(lwage ~ educ + black + hisp + exper+I(exper^2)+ married + union +
                      factor(year), data = wagepan)

Your second example seems to be equivalent to a random intercept mixed model with nr as random effect (but the slopes of all predictors are fixed).
This is what Gelman & Hill call a partial pooling approach.

random.effects <- plm(lwage ~ educ + black + hisp + exper + I(exper^2) + married + 
                          union + factor(year), data = wagepan, 
                      index = c("nr","year") , model = "random") 

random.effects.lme4 <- lmer(lwage ~ educ + black + hisp + exper + I(exper^2) + married + 
                                union + factor(year) + (1|nr), data = wagepan) 
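
One way to see why the random-effects column ends up between the pooled and the fixed-effects columns: for a balanced panel, the GLS random-effects estimator is OLS on quasi-demeaned data, y_it - theta * ybar_i, where theta = 1 - sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_a^2)) in Wooldridge's notation. With theta = 0 you get pooled OLS, with theta = 1 you get the within (fixed effects) estimator. The sketch below uses simulated data and hand-picked theta values (an assumption for illustration; neither plm nor lme4 picks theta this way, and all variable names are hypothetical), in base R only.

```r
# Simulated balanced panel: 50 individuals observed over 8 periods.
# All names and parameter values here are hypothetical.
set.seed(1)
n_id <- 50
n_t  <- 8
id <- rep(seq_len(n_id), each = n_t)
a  <- rnorm(n_id)[id]               # unobserved individual effect
x  <- 0.5 * a + rnorm(n_id * n_t)   # regressor correlated with the effect
y  <- 1 + 2 * x + a + rnorm(n_id * n_t)

# OLS slope after subtracting theta times the individual means.
# For a balanced panel the transformed intercept column is constant,
# so an ordinary intercept spans the same space.
quasi_demeaned_slope <- function(theta) {
  y_q <- y - theta * ave(y, id)
  x_q <- x - theta * ave(x, id)
  coef(lm(y_q ~ x_q))[["x_q"]]
}

slope_pooled <- quasi_demeaned_slope(0)    # theta = 0: complete pooling
slope_re     <- quasi_demeaned_slope(0.6)  # 0 < theta < 1: partial pooling
slope_within <- quasi_demeaned_slope(1)    # theta = 1: no pooling

print(c(pooled = slope_pooled, partial = slope_re, within = slope_within))
```

In a real analysis theta is estimated from the variance components rather than chosen by hand, and that estimation step is where plm (GLS with estimated variance components) and lmer ((restricted) maximum likelihood) differ, which would explain the small differences between columns (3) and (4) in the table further down.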

Your third example seems to correspond to a case where nr is a fixed effect and you compute a different intercept for each nr group.
Again, you can't do that with lme4 because there is no "random effect" (in the lme4 sense).
This is what Gelman & Hill call a "no pooling" approach.

fixed.effects <- plm(lwage ~ I(exper^2) + married + union + factor(year), 
                     data = wagepan, index = c("nr","year"), model="within")

wagepan$nr <- factor(wagepan$nr)
fixed.effects.lm <- lm(lwage ~  I(exper^2) + married + union + factor(year) + nr, 
                     data = wagepan)
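
The dummy-variable lm above and the within transformation are two routes to the same slope estimates. A minimal base-R sketch on simulated data (all names hypothetical, not the wagepan variables) shows this, and also shows one thing to watch for if you demean by hand: a plain lm on demeaned data does not know that the group means were estimated, so its standard errors need a degrees-of-freedom correction to match the dummy-variable fit.

```r
# Simulated balanced panel: 40 individuals, 6 periods (hypothetical values).
set.seed(42)
n_id <- 40
n_t  <- 6
id_num <- rep(seq_len(n_id), each = n_t)
id <- factor(id_num)
a  <- rnorm(n_id)[id_num]            # unobserved individual effect
x  <- 0.5 * a + rnorm(n_id * n_t)
y  <- 2 * x + a + rnorm(n_id * n_t)

# Route 1: one intercept per individual via dummy variables (LSDV).
fit_lsdv <- lm(y ~ x + id)

# Route 2: subtract individual means, then OLS without an intercept.
x_w <- x - ave(x, id)
y_w <- y - ave(y, id)
fit_dm <- lm(y_w ~ x_w - 1)

b_lsdv <- coef(fit_lsdv)[["x"]]
b_dm   <- coef(fit_dm)[["x_w"]]

se_lsdv <- summary(fit_lsdv)$coefficients["x", "Std. Error"]
se_dm   <- summary(fit_dm)$coefficients["x_w", "Std. Error"]

# The demeaned fit uses too many residual degrees of freedom;
# rescale its standard error to the LSDV degrees of freedom.
se_dm_corrected <- se_dm * sqrt(df.residual(fit_dm) / df.residual(fit_lsdv))

print(c(b_lsdv = b_lsdv, b_dm = b_dm))
print(c(se_lsdv = se_lsdv, se_dm = se_dm, se_dm_corrected = se_dm_corrected))
```

The matching standard errors for plm's "within" model and the dummy-variable lm in the comparison table suggest that plm already uses the corrected degrees of freedom, so no manual adjustment is needed there.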

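Compare the results. Note that the labels of the two lm columns are swapped in the call below: column (2), Pooled.ols.lm, is the complete-pooling fit, and column (6), fixed.effects.lm, is the no-pooling fit.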

stargazer::stargazer(Pooled.ols, Pooled.ols.lm, 
                     random.effects, random.effects.lme4 , 
                     fixed.effects, fixed.effects.lm,
                     type="text",
                     column.labels=c("OLS (pooled)", "lm no pool.",
                                     "Random Effects", "lme4 partial pool.", 
                                     "Fixed Effects", "lm compl. pool."), 
                     dep.var.labels = c("log(wage)"), 
                     keep.stat=c("n"),
                     keep=c("edu","bla","his","exp","marr","union"), 
                     align = TRUE, digits = 4)
#> 
#> =====================================================================================================
#>                                                Dependent variable:                                   
#>              ----------------------------------------------------------------------------------------
#>                                                     log(wage)                                        
#>                 panel         OLS         panel            linear           panel           OLS      
#>                 linear                    linear       mixed-effects       linear                    
#>              OLS (pooled) lm no pool. Random Effects lme4 partial pool. Fixed Effects lm compl. pool.
#>                  (1)          (2)          (3)              (4)              (5)            (6)      
#> -----------------------------------------------------------------------------------------------------
#> educ          0.0913***    0.0913***    0.0919***        0.0919***                                   
#>                (0.0052)    (0.0052)      (0.0107)         (0.0108)                                   
#>                                                                                                      
#> black         -0.1392***  -0.1392***    -0.1394***       -0.1394***                                  
#>                (0.0236)    (0.0236)      (0.0477)         (0.0485)                                   
#>                                                                                                      
#> hisp            0.0160      0.0160        0.0217           0.0218                                    
#>                (0.0208)    (0.0208)      (0.0426)         (0.0433)                                   
#>                                                                                                      
#> exper         0.0672***    0.0672***    0.1058***        0.1060***                                   
#>                (0.0137)    (0.0137)      (0.0154)         (0.0155)                                   
#>                                                                                                      
#> I(exper2)     -0.0024***  -0.0024***    -0.0047***       -0.0047***      -0.0052***     -0.0052***   
#>                (0.0008)    (0.0008)      (0.0007)         (0.0007)        (0.0007)       (0.0007)    
#>                                                                                                      
#> married       0.1083***    0.1083***    0.0640***        0.0635***        0.0467**       0.0467**    
#>                (0.0157)    (0.0157)      (0.0168)         (0.0168)        (0.0183)       (0.0183)    
#>                                                                                                      
#> union         0.1825***    0.1825***    0.1061***        0.1053***        0.0800***      0.0800***   
#>                (0.0172)    (0.0172)      (0.0179)         (0.0179)        (0.0193)       (0.0193)    
#>                                                                                                      
#> -----------------------------------------------------------------------------------------------------
#> Observations    4,360        4,360        4,360            4,360            4,360          4,360     
#> =====================================================================================================
#> Note:                                                                     *p<0.1; **p<0.05; ***p<0.01

Gelman A, Hill J (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press. (A very, very good book!)

Created on 2018-03-08 by the reprex package (v0.2.0).

Gilles San Martin
  • A truly excellent answer. Thanks a lot. Do you happen to know if Gelman and Hill (2007) cover the _difference in estimation method_? Thanks again! – Eric Fail Mar 08 '18 at 09:12
  • Gelman & Hill cover (restricted) maximum likelihood and MCMC/Bayesian approaches, but I don't think they cover the methods discussed in the `plm` package. – Gilles San Martin Mar 08 '18 at 13:00
  • Great answer. I have a quick question, do (any of) the Pooled OLS, RE or FE models count as strictly "longitudinal" analyses, or would you need to interact the IV with time, as in `educ*year`? Thanks in advance. – Marco Pastor Mayo Dec 08 '19 at 13:18
  • "Strictly longitudinal" should be defined (the meaning will probably be different for different people). In the mixed model (partial pooling) you generally make the distinction between random intercept and random slope models. In a random slope model like `lmer(lwage ~ year + (1+year|nr), data = wagepan)` you compute a different slope of lwage~year for each `nr` and then a sort of (weighted) average global slope (a meta-parameter). – Gilles San Martin Dec 08 '19 at 17:47