
My data.table (package: data.table) looks as follows:

library(data.table)
set.seed(1)
data <- data.table(groups = rep(c("b", "a"), c(3, 4)), time = c(1:3, 1:4), value = rnorm(7))

  groups time      value
1:      b    1 -0.6264538
2:      b    2  0.1836433
3:      b    3 -0.8356286
4:      a    1  1.5952808
5:      a    2  0.3295078
6:      a    3 -0.8204684
7:      a    4  0.4874291

I want to lag the value column within each group by more than one row. Below is an example of the output when value is lagged by 2 (the lag amount should be configurable, not fixed at 2):

  groups time      value  lag.value
1      a    1  1.5952808         NA
2      a    2  0.3295078         NA
3      a    3 -0.8204684  1.5952808
4      a    4  0.4874291  0.3295078
5      b    1 -0.6264538         NA
6      b    2  0.1836433         NA
7      b    3 -0.8356286 -0.6264538

Please advise.

EDIT: This question was posted for the purpose of elaborating on an answer posted here

greyBag
  • `data[, lag.value := shift(value, 2L), keyby = groups]`. You'll need to install the [devel version](https://github.com/Rdatatable/data.table/wiki/Installation) – David Arenburg Jul 30 '15 at 10:17
  • Did you say that `shift` won't work for you? Why is `data[, lag.value := lag(value, 2L), groups]` not working for you? – akrun Jul 30 '15 at 10:18
  • @akrun where did they mention `shift`? Also, base `lag` won't work here as expected. Are you referring to `dplyr` version? I don't think they mentioned it either. – David Arenburg Jul 30 '15 at 10:22
  • @DavidArenburg He actually posted this while asking a question [here](http://stackoverflow.com/questions/26291988/r-how-to-create-a-lag-variable-for-each-by-group) – akrun Jul 30 '15 at 10:25
  • `shift` doesn't work at the moment because I'm having trouble downloading the `devel` version, but I'll figure that problem out myself. `data[, lag.value := lag(value, 2L), groups]` does seem to work without issues. @DavidArenburg why shouldn't base `lag` work? – greyBag Jul 30 '15 at 10:28
  • Simply because it does an entirely different thing. `dplyr` overrides it. Try it on a fresh session without loading `dplyr` – David Arenburg Jul 30 '15 at 10:29
  • @DavidArenburg Yes, you are right about the `lag`. I loaded `dplyr` so it worked. – akrun Jul 30 '15 at 10:34
  • Maybe you can also do `data[, lag.value := c(rep(NA, 2), head(value,-2)), groups]` – akrun Jul 30 '15 at 10:37
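
The two suggestions from the comments can be put side by side in a minimal sketch. It assumes a data.table version that exports `shift` (as far as I know, 1.9.6 and later, so the devel install should no longer be needed). The names `n_lag`, `lag_by` and `lag.value.base` are made up for illustration, and `lag_by` assumes a lag of at least 1. The example data is rebuilt so the block runs on its own.

```r
library(data.table)

# Rebuild the example data from the question.
set.seed(1)
data <- data.table(
  groups = rep(c("b", "a"), c(3, 4)),
  time   = c(1:3, 1:4),
  value  = rnorm(7)
)

n_lag <- 2L  # hypothetical name: the lag amount, change as needed

# Option 1: data.table's shift(), applied within each group.
data[, lag.value := shift(value, n = n_lag, type = "lag"), by = groups]

# Option 2: base R only, following the c(rep(NA, n), head(x, -n)) idea.
# Hypothetical helper; assumes n >= 1. The final subsetting keeps the
# result the same length as x when a group has fewer than n rows.
lag_by <- function(x, n) {
  stopifnot(n >= 1)
  c(rep(NA, n), head(x, -n))[seq_along(x)]
}
data[, lag.value.base := lag_by(value, n_lag), by = groups]

# Sort by group and time to match the layout of the desired output.
setorder(data, groups, time)
print(data)
```

Both columns should agree row for row. Base `stats::lag()` is not a drop-in replacement here: on a plain vector it leaves the values where they are and only shifts the time-series (`tsp`) attribute, which is why it only appeared to work with `dplyr` loaded.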
