I have a data set containing several financial variables, including some fundamentals. For example, my debt data is dated April, but it actually belongs to, let's say, December. Because fundamentals are released at a later point in time, I have to lag them by approximately 4 months.
This is what my data looks like (illustration):
k <- c("gvkey1" , "gvkey1" , "gvkey1" , "gvkey1", "gvkey2", "gvkey2", "gvkey2", "gvkey2", "gvkey2", "gvkey3", "gvkey3")
l <- c("Date1", "Date2", "Date3", "Date4" , "Date5" , "Date6" , "Date7" , "Date8" , "Date9" , "Date10" , "Date11" )
m <- c(1:11)
y <- structure(list(a = l, b = k, c = m), .Names = c("Date", "gvkey" , "DLCQ"),
row.names = c(NA, -11L), class = "data.frame")
Date gvkey DLCQ
1 Date1 gvkey1 1
2 Date2 gvkey1 2
3 Date3 gvkey1 3
4 Date4 gvkey1 4
5 Date5 gvkey2 5
6 Date6 gvkey2 6
7 Date7 gvkey2 7
8 Date8 gvkey2 8
9 Date9 gvkey2 9
10 Date10 gvkey3 10
11 Date11 gvkey3 11
And this is the code I already tried:
library(data.table)  # assumption: shift() here is data.table::shift
x <- shift(y$DLCQ, 4L)
However, this returns a single vector and basically "deletes" all the other columns (Date, gvkey):
[1] NA NA NA NA 1 2 3 4 5 6 7
It should look something like this:
Date gvkey DLCQ
1 Date1 gvkey1 NA
2 Date2 gvkey1 NA
3 Date3 gvkey1 NA
4 Date4 gvkey1 NA
5 Date5 gvkey2 1
6 Date6 gvkey2 2
7 Date7 gvkey2 3
8 Date8 gvkey2 4
9 Date9 gvkey2 5
10 Date10 gvkey3 6
11 Date11 gvkey3 7
Moreover, since my data is in long format, the lag should be computed for each gvkey separately (e.g. with by = gvkey).
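To make the two requirements concrete, here is a minimal sketch. It assumes that `shift` comes from the data.table package and that the data has one row per month with no gaps, so that 4 rows correspond to 4 months; neither is confirmed above. The names `dt`, `ungrouped`, `grouped` and `DLCQ_lag4` are only illustrative.

```r
library(data.table)

# Rebuild the illustration data as a data.table
dt <- data.table(
  Date  = paste0("Date", 1:11),
  gvkey = rep(c("gvkey1", "gvkey2", "gvkey3"), times = c(4, 5, 2)),
  DLCQ  = 1:11
)

# Ungrouped lag: assign the shifted values back into the column with :=,
# so Date and gvkey stay in the table (copy() leaves dt untouched)
ungrouped <- copy(dt)[, DLCQ := shift(DLCQ, n = 4L, type = "lag")]

# Grouped lag: by = gvkey restarts the shift within each gvkey,
# so values never leak from one gvkey into the next
grouped <- copy(dt)[, DLCQ_lag4 := shift(DLCQ, n = 4L, type = "lag"), by = gvkey]

print(ungrouped)
print(grouped)
```

The ungrouped version reproduces the table shown above. In the grouped version, only row 9 (the fifth observation of gvkey2) receives a value, namely 5, because every other row has fewer than four earlier observations within its own gvkey.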
Thanks, Johannes