0

I would like to create a variable that counts starting from 1 for each unique value (of a specific variable or group of variables) in a data.table.

(mydata <- data.frame(cat   = c('a','a','a','b','c','c','c','c'),
                     color = c(  1,  1,  1,  2,  1,  1,  1,  1),
                     hat   = c(  1,  1,  2,  2,  1,  2,  1,  2),
                     shoe  = c(  0,  1,  1,  2,  1,  1,  1,  3)))
    cat color hat shoe
1     a     1   1    0
2     a     1   1    1
3     a     1   2    1
4     b     2   2    2
5     c     1   1    1
6     c     1   2    1
7     c     1   1    1
8     c     1   2    3

foo(mydata)
    cat color hat shoe  i
1     a     1   1    0  1
2     a     1   1    1  2
3     a     1   2    1  3
4     b     2   2    2  1
5     c     1   1    1  1
6     c     1   2    1  2
7     c     1   1    1  3
8     c     1   2    3  4

I was thinking something along the lines mydata[, count = 1:length(Cat), by=.(Cat)] but that does not seem to work.

I believe this question is distinct from Create counter with multiple variables because it specifically asks about the implementation of data.table. However, the second alternative answer to that question is the same as the answer to this question, so I am okay calling this a duplicate. Apologies.

Community
  • 1
  • 1
Francis Smart
  • 3,875
  • 6
  • 32
  • 58

1 Answers1

3

You guess was very close, you just need to use sequence

> DT <- data.table(mydata)
> DT[, i := sequence(.N), by = cat]
> DT
   cat color hat shoe i
1:   a     1   1    0 1
2:   a     1   1    1 2
3:   a     1   2    1 3
4:   b     2   2    2 1
5:   c     1   1    1 1
6:   c     1   2    1 2
7:   c     1   1    1 3
8:   c     1   2    3 4

By changing by argument you can create other sequences depending on the groups of columns.

Update As suggested in the comments by @veerendra-gadekar, it's possible to do it in even more compact way using setDT

> setDT(mydata)[, i := sequence(.N), by = cat]
> mydata
   cat color hat shoe i
1:   a     1   1    0 1
2:   a     1   1    1 2
3:   a     1   2    1 3
4:   b     2   2    2 1
5:   c     1   1    1 1
6:   c     1   2    1 2
7:   c     1   1    1 3
8:   c     1   2    3 4
romants
  • 3,660
  • 1
  • 21
  • 33