
On page 423 in Computational Laboratory for Economics, it is stated that "the argument model="fd" doesn't work correctly, with the current version (1.3-1) of plm, on unbalanced data with holes." Has this been fixed in the newer versions of plm?

As a workaround, the author used diff to obtain the first differences and fitted a model="pooling" on the differenced data. Can someone explain how the diff function works on unbalanced data with holes?

Also, on page 68 of the plm documentation (version 1.6-5), it is stated that "plm is a general function for the estimation of linear panel models. It supports the following estimation methods: pooled OLS (model = "pooling"), fixed effects ("within"), random effects ("random"), first–differences ("fd"), and between ("between"). It supports unbalanced panels and two–way effects (although not with all methods)."

M_M
  • Your question about how `plm` handles FD models and how to treat gaps is answered in detail in these two answers: https://stackoverflow.com/questions/39364471/residuals-from-first-differenced-regression-on-unbalanced-panel/39378265#39378265 and https://stackoverflow.com/questions/43926625/r-plm-lag-what-is-the-equivalent-to-l1-x-in-stata/43932317#43932317 – Helix123 Oct 18 '17 at 16:33
  • So, using the `diff` function on unbalanced data with holes wouldn't really work - rather it should be something like `pTestData$Y_diff <- plm:::lagt.pseries(pTestData$Y) - pTestData$Y` as in your other answer, right? – M_M Oct 18 '17 at 16:45
  • Yes, I don't know why the authors of that textbook came up with the idea of `diff` (this could be either `base::diff` or `plm::diff`; both give row-wise differences, the latter respecting the panel structure of the data). – Helix123 Oct 18 '17 at 16:47
  • So, bottom line, `model="fd"` should not be used with unbalanced data. `plm:::lagt.pseries` works for both balanced and unbalanced panel data. – M_M Oct 18 '17 at 17:50
  • It is not about the unbalancedness; that works fine. It is about gaps in the time dimension, as the standard approach is to difference neighbouring rows per individual (taking the panel structure into account). Use `is.pconsecutive` to check for any gaps. – Helix123 Oct 18 '17 at 18:04
  • Possible duplicate of [Is there a predict function for PLM in R?](https://stackoverflow.com/questions/7123060/is-there-a-predict-function-for-plm-in-r) – Eric Fail Oct 18 '17 at 20:03
  • @Helix123, this could be a separate question, and I am happy to post it as one. When comparing the fd and pooling models on the data and the differenced data as we discussed, I get the same results as expected, but not always! My data is indexed as a panel by ids and a serial time variable that combines month and year. In the model, I add month and year dummies to control for seasonality and trend. In this example, the results of the two models (fd/pooling) are dramatically different when I add the year dummies. Any idea why? – M_M Oct 18 '17 at 23:39
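
To make the distinction drawn in the comments concrete, here is a minimal sketch in base R (so it does not depend on any particular plm version). The toy data frame and the column names `d_row` and `d_time` are made up for illustration. It contrasts a row-wise difference within each individual with a time-aware difference that returns `NA` when the previous period is missing, and adds a simple per-individual gap check in the spirit of `is.pconsecutive`. It is not meant to reproduce plm's internals.

```r
# Toy panel (hypothetical data): individual 1 has a hole at time 3.
panel <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2, 2),
  time = c(1, 2, 4, 1, 2, 3, 4),
  y    = c(10, 12, 18, 5, 6, 8, 11)
)
panel <- panel[order(panel$id, panel$time), ]

# Row-wise first difference within each individual: neighbouring rows are
# subtracted no matter how far apart they are in time.
panel$d_row <- ave(panel$y, panel$id, FUN = function(v) c(NA, diff(v)))

# Time-aware first difference: look up the same individual's observation at
# time t - 1; if that period is missing, the difference is NA.
key      <- paste(panel$id, panel$time)
prev_key <- paste(panel$id, panel$time - 1)
panel$d_time <- panel$y - panel$y[match(prev_key, key)]

# Simple gap check per individual: TRUE only if every time step equals 1.
consecutive <- tapply(panel$time, panel$id, function(t) all(diff(t) == 1))

print(panel)
print(consecutive)
```

For individual 1, the row-wise version subtracts the value at time 2 from the value at time 4 and treats the result as a one-period change, while the time-aware version leaves that row as `NA`. The two columns agree wherever the time index has no holes, which is consistent with the remark above that gaps, not unbalancedness as such, are the issue.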

0 Answers