In data.table in R, how can we create a sequenced indicator variable by the values of two columns?

Question

In the data.table package in R, how can I create an index that identifies rows sharing the same pair of values in two columns? For example, take the following data table:

> M <- data.table(matrix(c(2,2,2,2,2,2,2,5,2,5,3,3,3,6), ncol = 2, byrow = TRUE))
> M
   V1 V2
1:  2  2
2:  2  2
3:  2  2
4:  2  5
5:  2  5
6:  3  3
7:  3  6

I would like to add a new column that numbers each distinct combination of values in the two columns, so that I get something like:

> M
   V1 V2 Index
1:  2  2     1
2:  2  2     1
3:  2  2     1
4:  2  5     2
5:  2  5     2
6:  3  3     3
7:  3  6     4

Essentially, I would like a group counter in the spirit of .N, repeated on every row of its group. Is there a nice way to do it?

score 4 · Accepted Answer · answered Jun 12 '18 at 20:15

We can use .GRP, which holds the integer counter of the current group, after grouping by 'V1' and 'V2':

M[, Index := .GRP, .(V1, V2)]
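
As a minimal, self-contained sketch of the one-liner above, the following rebuilds the question's table and applies the grouping. The second table, U, is a hypothetical unsorted example that is not part of the original question; it is included only to illustrate that, with by, groups are numbered in order of first appearance, so non-adjacent rows with the same (V1, V2) pair still share an index.

```r
library(data.table)

# Rebuild the table from the question
M <- data.table(matrix(c(2,2,2,2,2,2,2,5,2,5,3,3,3,6), ncol = 2, byrow = TRUE))

# .GRP is the integer counter of the current group; := writes it
# to every row belonging to that (V1, V2) combination
M[, Index := .GRP, by = .(V1, V2)]
print(M)

# Hypothetical unsorted input (an assumption for illustration only):
# rows 1 and 3 hold the same pair, so they receive the same group number
U <- data.table(V1 = c(3, 2, 3), V2 = c(6, 2, 6))
U[, Index := .GRP, by = .(V1, V2)]
print(U)
```

If the numbering should instead restart whenever the pair changes from one row to the next, regardless of earlier occurrences, data.table's rleid(V1, V2) may be the closer fit; on the sorted data in the question both approaches would give the same result.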

akrun
