data.table: difference to first of group

Question

I have data

library(data.table)
dat <- data.table(id=1:8, group=c(1,1,2,2,2,3,3,3), val=c(4,10,5,10,10,6,10,10))

> dat
   id group val
1:  1     1   4
2:  2     1  10
3:  3     2   5
4:  4     2  10
5:  5     2  10
6:  6     3   6
7:  7     3  10
8:  8     3  10

and I would like to subtract from each value the first value of its group:

> res
   id group val dif
1:  1     1   4   0
2:  2     1  10   6
3:  3     2   5   0
4:  4     2  10   5
5:  5     2  10   5
6:  6     3   6   0
7:  7     3  10   4
8:  8     3  10   4

I am always astonished by the efficiency of data.table, so I'm wondering whether it can offer a solution. Of course, any other efficient method is just as welcome.

Your `val` values are integers; how did they change to values with decimals? — Onyambu, Feb 27 '18 at 16:59
You have posted quite a few `data.table` questions recently, with several nice answers. Please show us that you learned from them and show us what you have tried. SO is not a free code writing service (yes, I know our FGITWs disagree). — Henrik, Feb 27 '18 at 17:05
I would add that you could easily have found the answer in the introduction to basic data.table usage before asking: https://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.html — denis, Feb 27 '18 at 17:10
To second [Henrik](https://stackoverflow.com/questions/49014011/data-table-difference-to-first-of-group#comment85032989_49014011), here is the link to a very similar question posted by the OP today: https://stackoverflow.com/q/49013507/3817004 — Uwe, Feb 27 '18 at 18:16
Now I go through the trouble of dissecting the problem into different aspects (so as to make it more generally informative to others) but apparently this is not okay. Should I post them all in one question that ends up so specific that it won't even be potentially useful to others? — bumblebee, Feb 27 '18 at 21:28

score 2 · Accepted Answer · answered Feb 27 '18 at 17:04

dat[, diff := val - val[1], by = group]
dat
   id group val diff
1:  1     1   4    0
2:  2     1  10    6
3:  3     2   5    0
4:  4     2  10    5
5:  5     2  10    5
6:  6     3   6    0
7:  7     3  10    4
8:  8     3  10    4
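
A self-contained sketch of the same idea, assuming the example data from the question. Inside `[...]` with `by = group`, `val` is the vector of values for the current group, so `val[1]` is that group's first value in row order. The column is named `dif` here only to match the desired output in the question; the name is a free choice.

```r
library(data.table)

# Example data, copied from the question
dat <- data.table(id    = 1:8,
                  group = c(1, 1, 2, 2, 2, 3, 3, 3),
                  val   = c(4, 10, 5, 10, 10, 6, 10, 10))

# Per group, subtract the first value (in current row order) from every value.
# `:=` adds the column by reference, so `dat` itself is modified.
dat[, dif := val - val[1], by = group]

dat$dif
# 0 6 0 5 5 0 4 4
```

"First" here means first in the table's current row order. If the reference row should instead be chosen by another column (for example the smallest `id`, which is an assumption about the intent), sorting beforehand with `setorder(dat, group, id)` would make that explicit.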

Onyambu


score 0 · Answer 2 · answered Feb 27 '18 at 17:07

With Tidyverse (dplyr) you can do this:

library(data.table)  # needed for data.table() below
library(dplyr)

dat <- data.table(id=1:8,
  group=c(1,1,2,2,2,3,3,3),
  val=c(4,10,5,10,10,6,10,10)
  )

dat %>%
  group_by(group) %>%
  mutate(dif = val - first(val))  # signed difference to the group's first value

#># A tibble: 8 x 4
#># Groups:   group [3]
#>     id group   val   dif
#>  <int> <dbl> <dbl> <dbl>
#>1     1  1.00  4.00  0   
#>2     2  1.00 10.0   6.00
#>3     3  2.00  5.00  0   
#>4     4  2.00 10.0   5.00
#>5     5  2.00 10.0   5.00
#>6     6  3.00  6.00  0   
#>7     7  3.00 10.0   4.00
#>8     8  3.00 10.0   4.00
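
Since the question welcomes other methods, the same per-group subtraction can also be sketched in base R with `ave()`, which applies a function within each group and returns the results in the original row order. This is an alternative to the approach above, not part of it, and the name `dat_df` is hypothetical.

```r
# Example data from the question, as a plain data.frame
dat_df <- data.frame(id    = 1:8,
                     group = c(1, 1, 2, 2, 2, 3, 3, 3),
                     val   = c(4, 10, 5, 10, 10, 6, 10, 10))

# ave() splits `val` by `group`, applies FUN to each piece,
# and puts the results back in the original row positions.
# FUN must be passed by name because of the `...` argument.
dat_df$dif <- ave(dat_df$val, dat_df$group, FUN = function(x) x - x[1])

dat_df$dif
# 0 6 0 5 5 0 4 4
```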
