I have a data frame with two variables, PPCE and PDPI (the examples are from Gujarati's Basic Econometrics textbook). I run the first regression:

lm(df$PPCE ~ df$PDPI) -> lm1

then create a lagged series of PDPI with lag one:

c(NA, head(df$PDPI, -1)) -> lagged1

and then run the second regression:

lm(df$PPCE ~ df$PDPI + lagged1) -> lm2

When I run anova(lm1, lm2) to find out whether I should include the lagged PDPI variable, I get:

Error in anova.lmlist(object, ...) : models were not all fitted to the same size of dataset

So my question is: how can I use the anova function in R to check whether lagged variables should be included in the model?

Antti29
Nikola
  • `lm` has an argument `na.action` that defaults to `na.omit`. If you want to keep the sizes of the datasets equal, maybe you could try `na.action = NULL`. – Rui Barradas Nov 04 '17 at 21:35
  • See this post [https://stackoverflow.com/questions/18387258/r-error-which-says-models-were-not-all-fitted-to-the-same-size-of-dataset] – kamila Jan 17 '18 at 09:25
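
The error arises because the NA in the lagged series makes `lm` drop one row from the second fit only. As a minimal base-R sketch of one way around this, both models can be fitted to the same complete rows. The data below are simulated stand-ins (the original data are not shown in the question), and the names `PDPI_lag1` and `df_cc` are hypothetical:

```r
# Simulated stand-in for the textbook data (values are made up for illustration)
set.seed(1)
n <- 20
PDPI <- cumsum(runif(n, 50, 150)) + 1000
PPCE <- 0.9 * PDPI + rnorm(n, sd = 20)
df <- data.frame(PPCE = PPCE, PDPI = PDPI)

# Store the lagged regressor in the data frame itself (first value is NA)
df$PDPI_lag1 <- c(NA, head(df$PDPI, -1))

# Keep only rows where every variable is observed,
# so that both fits use exactly the same sample
df_cc <- na.omit(df)

lm1 <- lm(PPCE ~ PDPI, data = df_cc)
lm2 <- lm(PPCE ~ PDPI + PDPI_lag1, data = df_cc)

# Both models now have the same number of observations, so the F test runs
anova(lm1, lm2)
```

Using `data = df_cc` with bare column names, rather than `df$PPCE` inside the formula, is what lets the same subset be applied to both models.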

1 Answer

The dyn package's anova.dyn method can handle fits of differing sizes by intersecting them. It works with zoo and ts objects. The example below uses the built-in BOD data frame. See the dyn package documentation for more info.

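library(dyn) # also loads zoo

# Convert the built-in BOD data frame to a zoo series (first column becomes the index)
z <- read.zoo(BOD)

# Fit without and with the one-period lag; lag(z, -1) is the previous observation
lm1 <- dyn$lm(z ~ time(z))
lm2 <- dyn$lm(z ~ time(z) + lag(z, -1))

# anova.dyn compares the two fits on their common observations
anova(lm1, lm2)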
G. Grothendieck
  • Thanks. It works even with data frame, it doesn't have to be zoo or ts. – Nikola Nov 05 '17 at 13:17
  • What's strange is that it isn't working now. When I convert my data.frame to zoo (as.zoo(df) -> df) and apply lm or dyn$lm: dyn$lm(df$PPCE ~ df$PDPI + lag(df$PDPI, -1)) -> lm1 I do not get 3 coefficients as I should; I get a coefficient for every observation. – Nikola Nov 05 '17 at 13:52
  • (1) That is just a data file. Please edit your question and include a minimal reproducible complete example without resorting to external links. It should be possible for anyone else to simply copy the code in the question and paste it into their session to see the same problem you are having. See [mcve] and `?dput` (2) Also regarding using data frames that won't work because `lag(x, -1)` won't give a useful result if x is not a time series object. – G. Grothendieck Nov 05 '17 at 16:44
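
Point (2) in the last comment can be illustrated with a small base-R sketch. The vector `x` and the other names here are made up for illustration. It uses `stats::lag` and a `ts` object rather than zoo, so it only shows the general behaviour, not dyn itself:

```r
x <- c(10, 20, 30, 40)

# On a plain numeric vector, lag() leaves the values exactly where they were;
# it only attaches a shifted time attribute
lagged_vec <- stats::lag(x, -1)
identical(as.numeric(lagged_vec), x)

# On a ts object the time index is shifted, so aligning by time
# puts each value next to the previous period's value
x_ts <- ts(x)
aligned <- cbind(x_ts, stats::lag(x_ts, -1))
aligned
```

In a regression on plain data frame columns, the "lagged" regressor would therefore be an exact copy of the original column, which is why a time series class is needed for `lag(x, -1)` to be useful.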