0

How to create a lag variable in unbalanced time series table (unevenly spaced in time) ? From data table -

ts = data.table(time=c(10,15,22,25,28),value=c(7,2,14,22,11), key="time")

How do you create lagged variable value_lagged of value by time-5 ?

time value value_lagged
10 7 NA
15 2 7
22 14 2
25 22 2
28 11 14

My usual solution is to create a huge cross-join table and then use 'by' + 'min' to get back to initial state, but this doesn't work for 1M tables.

This question has an answer here - https://stackoverflow.com/a/36754846/1317325 but too hard to find.

Community
  • 1
  • 1
RInatM
  • 1,208
  • 1
  • 17
  • 39
  • Just post that comment by referencing to the one who posted it as an answer there. Or if you want to post as Q/A style then include more details than what you have in the body of the question. Not just a link. – M-- May 17 '17 at 18:08
  • I have been stumbling upon this problem for 3 years now, each time spending an hour or so to figure out fastest solution As Arun't comment no so easy to find, it's better be here – RInatM May 17 '17 at 18:09
  • Agreed. But still, don't you think your "question" does not reflect the problem so clearly. Just make it a reproducible example, with more details. Again, don't post just a link. That makes your question to be lost in SO after some time. Unless you want to document it only for yourself. – M-- May 17 '17 at 18:12
  • I knew I'd end up spending more time for edits :) Anyway, thanks for forcing me, now it's much more readable – RInatM May 17 '17 at 18:23

1 Answers1

0

This is copypaste from Arun's comment from here

ts[, value_lagged := ts[.(time=time-5), value, roll=+Inf, rollends=TRUE, mult="first", on="time"]]

For LEAD you can just change the signs -

ts[, value_lead := ts[.(time=time+5), value, roll=-Inf, rollends=TRUE, mult="first", on="time"]]
Community
  • 1
  • 1
RInatM
  • 1,208
  • 1
  • 17
  • 39