In the data.table package in R, for a given data table, I am wondering how an indicator index can be created for the values that are the same in two columns. For example, for the following data table,

> M <- data.table(matrix(c(2,2,2,2,2,2,2,5,2,5,3,3,3,6), ncol = 2, byrow = T))
> M
   V1 V2
1:  2  2
2:  2  2
3:  2  2
4:  2  5
5:  2  5
6:  3  3
7:  3  6

I would like to add a new column that numbers each distinct combination of values in the two columns, so that rows with the same (V1, V2) pair share the same index, like this:

> M
   V1 V2 Index
1:  2  2     1
2:  2  2     1
3:  2  2     1
4:  2  5     2
5:  2  5     2
6:  3  3     3
7:  3  6     4

Essentially, I would like something like .N, repeated across the rows of each group. Is there a nice way to do this?

user321627

1 Answer

We can use .GRP, data.table's group counter, after grouping by 'V1' and 'V2':

M[, Index := .GRP, .(V1, V2)]
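For completeness, here is a self-contained sketch of the one-liner above, run on the table from the question. It assumes only that the data.table package is installed; the final check is added for illustration and is not part of the original answer.

```r
library(data.table)

# Rebuild the example table from the question
M <- data.table(matrix(c(2,2,2,2,2,2,2,5,2,5,3,3,3,6), ncol = 2, byrow = TRUE))

# .GRP holds the counter of the current group; := writes it to every row of that group
M[, Index := .GRP, by = .(V1, V2)]

print(M)

# Illustrative check against the output requested in the question
stopifnot(identical(M$Index, c(1L, 1L, 1L, 2L, 2L, 3L, 4L)))
```

With by, groups are numbered in order of first appearance, so the index follows the row order of the table even if identical pairs are not adjacent.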
akrun