
I want to compute the following time series regression using R:

$\Delta y_t=\beta_1 \Delta x_t+\beta_2 \Delta z_t+\beta_3 \Delta m_t+\beta_4 \Delta y_{t-1}$

Since I do not have much experience with R, I want to ask whether the following R code gives me what I want:

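# first differences of each series
y <- ts(diff(YY))
x <- ts(diff(XX))
z <- ts(diff(ZZ))
m <- ts(diff(MM))
# first lag of the differenced dependent variable
l1 <- lag(y, k = -1)
# keep only the time points at which every series is observed
int <- ts.intersect(y, x, z, m, l1)
reg1 <- lm(y ~ x + z + m + l1, data = int)
summary(reg1)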

Sorry, but I can't find the typo in my formula.

Here is a data sample:

Date         YY     XX       ZZ      MM
03.01.2005  2.154   2.089   0.001   344999
04.01.2005  2.151   2.084   0.006   344999
05.01.2005  2.151   2.087   -0.007  333998
06.01.2005  2.15    2.085   -0.005  333998
07.01.2005  2.146   2.086   -0.006  333998
10.01.2005  2.146   2.087   -0.007  333998
11.01.2005  2.146   2.089   -0.009  333998
12.01.2005  2.145   2.085   -0.005  339999
13.01.2005  2.144   2.084   -0.004  339999
14.01.2005  2.144   2.085   -0.005  339999
17.01.2005  2.143   2.085   -0.005  339999
18.01.2005  2.144   2.085   -0.005  347999
19.01.2005  2.143   2.086   -0.006  354499
20.01.2005  2.144   2.087   -0.007  354499
21.01.2005  2.143   2.087   -0.007  354499
24.01.2005  2.143   2.086   -0.006  354499
25.01.2005  2.144   2.086   -0.006  354499
26.01.2005  2.143   2.086   -0.006  347999
27.01.2005  2.144   2.085   -0.005  352998
28.01.2005  2.144   2.084   -0.004  352998
31.01.2005  2.142   2.084   -0.004  352998
01.02.2005  2.142   2.083   -0.003  352998
02.02.2005  2.141   2.083   -0.003  357499
03.02.2005  2.144   2.088   -0.008  357499
04.02.2005  2.142   2.084   -0.004  357499
07.02.2005  2.142   2.084   -0.004  359999
08.02.2005  2.141   2.083   -0.003  355500

I tried what fg nu suggested in the answer to my original question, but I get an error message.

1. `zooX = zoo(test4[, -1], order.by = test4$Date)`: this command works fine. (The first column of my data set is the date column, so my data set looks exactly like the data sample in my question.)
2. I ran the regression `lmX = dynlm(d(YY) ~ d(XX) + d(ZZ) + d(MM) + L(YY, 1), data = zooX)`. Here I get the following error message:

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases
In addition: Warning message:
In dynlm(d(YY) ~ d(XX) + d(ZZ) + d(MM) + L(YY, 1), data = zooX) : empty model frame specified

What is the bug I am overlooking?

Michael B
  • Please provide a reproducible example http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Steven Beaupré Apr 28 '15 at 22:14
  • OLS could be done if `int` is set up as required. However autocorrelation may not be handled correctly and it is better to check the residual plot, etc. – Jaehyeon Kim Apr 29 '15 at 01:37
  • Thanks! What do you mean by "if `int` is set up as required"? – Michael B Apr 29 '15 at 10:57
  • You probably have a character column in your data, which means that your entire `zoo` matrix will be upcast to character, and that means that `lm` complains about the `0 (non-NA) cases`, as numbers represented as string/character are not valid. – tchakravarty Apr 29 '15 at 11:44
  • OK, I deleted the date column from my data set and then ran `zooX = zoo(test4)` followed by `lmX = dynlm(d(YY) ~ d(XX) + d(ZZ) + d(MM) + L(YY, 1), data = zooX)` and `summary(lmX)`. Now it seems to work. What I still do not understand in your code is whether `L(YY, 1)` is the lag of the difference of `YY` or just the lag of `YY`, i.e. not differenced. – Michael B Apr 29 '15 at 12:01
  • I have updated my answer to reflect the last point you made. – tchakravarty Apr 29 '15 at 12:26
  • Many thanks! So there is no difference in the output whether one takes your code or mine, is this correct? One last question regarding my original code: if I use `l1 <- ts(lag(y, k=-1))` instead of `l1 <- lag(y, k=-1)` (I only added `ts`), all else equal, and then run the regression, I get the following warning: `Warning message: In summary.lm(reg1) : essentially perfect fit: summary may be unreliable`. What is the problem there? If I remove the `ts` in front of the lagged variable, it works perfectly fine. – Michael B Apr 29 '15 at 13:24
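
On the last point, a likely explanation is that `ts()` does not keep the time index of the series it is given: unless `start` is passed, it builds a new index beginning at 1. Wrapping the lagged series in `ts()` would therefore undo the shift, so `l1` lines up value for value with `y` and the regression is fitted on a copy of the dependent variable. A minimal base R sketch with a made-up toy series (the numbers and the names `l1_bad`, `good` and `bad` are illustrative, not from the question):

```r
# Toy series, purely illustrative
y <- ts(c(5, 3, 8, 6, 9))

# lag() keeps the values and shifts the time index forward by one period
l1 <- lag(y, k = -1)
tsp(l1)       # start, end, frequency of the shifted series

# ts() rebuilds the index from 1, so the shift is lost
l1_bad <- ts(lag(y, k = -1))
tsp(l1_bad)

# Aligning on the time index: only `good` pairs y_t with y_{t-1}
good <- ts.intersect(y, l1)
bad  <- ts.intersect(y, l1_bad)

good
bad           # both columns are identical, hence the "perfect fit" warning
```

If the lagged series has to be rebuilt with `ts()` for some reason, passing its original `start` would be one way to keep the alignment.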

1 Answer


Use the dynlm package. Here is an example using the data you supplied:

library(dynlm)

dfX = read.table(
  textConnection(
    "Date         YY     XX       ZZ      MM
  03.01.2005  2.154   2.089   0.001   344999
  04.01.2005  2.151   2.084   0.006   344999
  05.01.2005  2.151   2.087   -0.007  333998
  06.01.2005  2.15    2.085   -0.005  333998
  07.01.2005  2.146   2.086   -0.006  333998
  10.01.2005  2.146   2.087   -0.007  333998
  11.01.2005  2.146   2.089   -0.009  333998
  12.01.2005  2.145   2.085   -0.005  339999
  13.01.2005  2.144   2.084   -0.004  339999
  14.01.2005  2.144   2.085   -0.005  339999
  17.01.2005  2.143   2.085   -0.005  339999
  18.01.2005  2.144   2.085   -0.005  347999
  19.01.2005  2.143   2.086   -0.006  354499
  20.01.2005  2.144   2.087   -0.007  354499
  21.01.2005  2.143   2.087   -0.007  354499
  24.01.2005  2.143   2.086   -0.006  354499
  25.01.2005  2.144   2.086   -0.006  354499
  26.01.2005  2.143   2.086   -0.006  347999
  27.01.2005  2.144   2.085   -0.005  352998
  28.01.2005  2.144   2.084   -0.004  352998
  31.01.2005  2.142   2.084   -0.004  352998
  01.02.2005  2.142   2.083   -0.003  352998
  02.02.2005  2.141   2.083   -0.003  357499
  03.02.2005  2.144   2.088   -0.008  357499
  04.02.2005  2.142   2.084   -0.004  357499
  07.02.2005  2.142   2.084   -0.004  359999
  08.02.2005  2.141   2.083   -0.003  355500"
  ), header = TRUE)
dfX$Date = as.Date(dfX$Date, format = "%d.%m.%Y")

# convert to zoo format
zooX = zoo(dfX[, -1], order.by = dfX$Date)

# run a regression with time transformed regressors
lmX = dynlm(d(YY) ~ d(XX) + d(ZZ) + d(MM) + d(L(YY, 1)), data = zooX)
summary(lmX)

This gives the output:

> summary(lmX)

Time series regression with "zoo" data:
Start = 2005-01-05, End = 2005-02-08

Call:
dynlm(formula = d(YY) ~ d(XX) + d(ZZ) + d(MM) + d(L(YY, 1)), 
    data = zooX)

Residuals:
       Min         1Q     Median         3Q        Max 
-0.0039592 -0.0003746  0.0000854  0.0006254  0.0018715 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept) -5.008e-04  2.766e-04  -1.811   0.0853 .
d(XX)        2.943e-01  2.409e-01   1.222   0.2359  
d(ZZ)        2.038e-03  1.715e-01   0.012   0.9906  
d(MM)        7.808e-08  8.251e-08   0.946   0.3553  
d(L(YY, 1)) -1.677e-01  2.103e-01  -0.797   0.4346  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.001248 on 20 degrees of freedom
Multiple R-squared:  0.2579,    Adjusted R-squared:  0.1095 
F-statistic: 1.738 on 4 and 20 DF,  p-value: 0.1813
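
A note on the last regressor: in `dynlm`, `L(YY, 1)` is the first lag of the level of `YY`, while `d(L(YY, 1))` is the first difference of that lagged series, which corresponds to the $\Delta y_{t-1}$ term in the question. Lagging and differencing can be applied in either order. The following base R sketch illustrates this with a made-up toy series (the values and variable names are assumptions for illustration only):

```r
# Toy series, purely illustrative
y <- ts(c(5, 3, 8, 6, 9, 4))

# Difference of the lagged series, analogous to d(L(YY, 1))
lag_then_diff <- diff(lag(y, k = -1))

# Lag of the differenced series, as in the question's lag(ts(diff(YY)), k = -1)
diff_then_lag <- lag(diff(y), k = -1)

# Same values on the same time index
all.equal(lag_then_diff, diff_then_lag)

# The lagged level is a different regressor altogether
lagged_level <- lag(y, k = -1)
ts.intersect(lagged_level, lag_then_diff)
```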
tchakravarty
  • Thanks. But why would you use the dynlm package? I can't reproduce your code in R. Here is what I did: `zooX = zoo(test4[, -1], order.by = test4$Date)` (the CSV file I loaded into R is called test4). But when I run `lmX = dynlm(d(YY) ~ d(XX) + d(ZZ) + d(MM) + L(YY, 1), data = zooX)`, the following error message pops up: `Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases In addition: Warning message: In dynlm(d(YY) ~ d(XX) + d(ZZ) + d(MM) + L(YY, 1), data = zooX) : empty model frame specified`. Shouldn't your result be equal to mine? – Michael B Apr 29 '15 at 10:47
  • @MichaelB Because `dynlm` is built exactly to simplify the kinds of time series regressions you are running. Please add a description of the problem that you are facing to the question. My guess is that the date column is not the first column of your dataset (`test4[, -1]`). – tchakravarty Apr 29 '15 at 11:00
  • Thanks! But apart from my code not being efficient, does it give me the result I am looking for (as in my question: regressing my differenced dependent variable y on my differenced independent variables x, z, m plus the first lag of the differenced dependent variable y)? Even though it is not efficient, is my code specified correctly, and will it produce the same regression results as your code? To check whether my code works, I tried to calculate the ADF test (from the package `urca`) by hand (i.e. using `ts.intersect`) and I get the same result, but I am still worried that there might be a mistake. – Michael B Apr 29 '15 at 11:27
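
One way to check the base R specification without relying on `dynlm` is to build the lagged regressor by plain indexing and compare the two fits. The sketch below does this with the sample data from the question (typed in as vectors, dates omitted); the names `manual`, `reg_ts` and `reg_manual` are illustrative. Both versions use 25 observations and leave 20 residual degrees of freedom, the same as in the `dynlm` output in the answer.

```r
# Sample data from the question, in date order
YY <- c(2.154, 2.151, 2.151, 2.150, 2.146, 2.146, 2.146, 2.145, 2.144,
        2.144, 2.143, 2.144, 2.143, 2.144, 2.143, 2.143, 2.144, 2.143,
        2.144, 2.144, 2.142, 2.142, 2.141, 2.144, 2.142, 2.142, 2.141)
XX <- c(2.089, 2.084, 2.087, 2.085, 2.086, 2.087, 2.089, 2.085, 2.084,
        2.085, 2.085, 2.085, 2.086, 2.087, 2.087, 2.086, 2.086, 2.086,
        2.085, 2.084, 2.084, 2.083, 2.083, 2.088, 2.084, 2.084, 2.083)
ZZ <- c(0.001, 0.006, -0.007, -0.005, -0.006, -0.007, -0.009, -0.005,
        -0.004, -0.005, -0.005, -0.005, -0.006, -0.007, -0.007, -0.006,
        -0.006, -0.006, -0.005, -0.004, -0.004, -0.003, -0.003, -0.008,
        -0.004, -0.004, -0.003)
MM <- c(344999, 344999, 333998, 333998, 333998, 333998, 333998, 339999,
        339999, 339999, 339999, 347999, 354499, 354499, 354499, 354499,
        354499, 347999, 352998, 352998, 352998, 352998, 357499, 357499,
        357499, 359999, 355500)

# Approach from the question: ts objects aligned with ts.intersect()
y <- ts(diff(YY)); x <- ts(diff(XX)); z <- ts(diff(ZZ)); m <- ts(diff(MM))
l1  <- lag(y, k = -1)
int <- ts.intersect(y, x, z, m, l1)
reg_ts <- lm(y ~ x + z + m + l1, data = as.data.frame(int))

# Manual alignment: drop the first difference, pair it with the previous one
dy <- diff(YY); n <- length(dy)
manual <- data.frame(y = dy[-1], x = diff(XX)[-1], z = diff(ZZ)[-1],
                     m = diff(MM)[-1], l1 = dy[-n])
reg_manual <- lm(y ~ x + z + m + l1, data = manual)

# The two fits should agree if the lag is specified as intended
all.equal(coef(reg_ts), coef(reg_manual))
df.residual(reg_ts)
```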