5

Really frustrated by this. I just want to take difference between rows in a data.table. where dif(n) = value(n) - value(n-1). so, compared to what I have, the results should be shifted by 1 position, and the first position for each "variable" should be NA. i.e. dif should be (NA, 4, -2, NA, 1, -8). The first value for each "variable should be NA because there is no position n-1. Any idea how I can modify the function to accomplish this? Would really like to know how I can do this with rollapplyr for the sake of my own understanding.

Thanks.

data.table:

> dt
       variable value
    1:      xyz     3
    2:      xyz     7
    3:      xyz     5
    4:      abc     9
    5:      abc    10
    6:      abc     2
> dt[,dif := rollapplyr(value, 2, function(x){r <- diff(x,lag = 1)}, align = "right"), by = list(variable)]

> dt
   variable value dif
1:      xyz     3   4
2:      xyz     7  -2
3:      xyz     5   4
4:      abc     9   1
5:      abc    10  -8
6:      abc     2   1
user3385769
  • 161
  • 6
  • 16
  • what would you want your last value to be? I see that value 5 of dif is -8 form (2 - 8), what would you want where the 1 is? – Derek Corcoran May 10 '16 at 14:31
  • The more general question is here: http://stackoverflow.com/q/14689424/ (with the same answer as mtoto's) – Frank May 10 '16 at 14:36
  • dif should be (NA, 4, -2, NA, 1, -8). The first value for each "variable should be NA because there is no position n-1. – user3385769 May 10 '16 at 14:37
  • 1
    Fyi, this is why you should put your desired output into the question itself in the first place, so there's no room for misunderstanding it. It's not too late to edit it in now. – Frank May 10 '16 at 14:39
  • You can also do dt[,c(NA,diff(value)), by = variable] – skan Nov 02 '16 at 11:44
  • @skan, this would be a deepcopy of whole `value` – ivan866 Oct 05 '20 at 18:20

2 Answers2

16

We could use shift():

dt[,diff := value - shift(value), by = variable]
> dt
   variable value diff
1:      xyz     3   NA
2:      xyz     7    4
3:      xyz     5   -2
4:      abc     9   NA
5:      abc    10    1
6:      abc     2   -8
mtoto
  • 23,919
  • 4
  • 58
  • 71
  • this just shifts the entire column down. The first value for abc, and xyz should be NA because there is no n-1 value at these positions. This is the behavior I get when I use all other functions with rollapplyr() – user3385769 May 10 '16 at 14:39
  • Thanks. Much appreciated. This definitely works and will use it. I would still really like to understand why I can't use rollapplyr() though. – user3385769 May 10 '16 at 14:41
2

Try this:

dt[,dif := rollapplyr(value, 2, function(x){diff(x,lag = 1)},na.pad=TRUE), by = list(variable)]

> dt
   variable value dif
1:      xyz     3  NA
2:      xyz     7   4
3:      xyz     5  -2
4:      abc     9  NA
5:      abc    10   1
6:      abc     2  -8
dal233
  • 80
  • 1
  • 6
  • Since the call to `rollapplyr` changed it would be great to update the answer. From the `rollapplyr` documentation: `na.pad` deprecated. Use `fill = NA` instead of `na.pad = TRUE`. – hannes101 Aug 08 '17 at 08:45