What is the most efficient way to calculate a group index (group identifier) across multiple columns in a data frame or data.table in R?
For example, in the data frame below, there are six unique combinations of columns a and b.
DF <- data.frame(a = rep(1:2,6), b = sort(letters[1:3]))
> DF
   a b
1  1 a
2  2 b
3  1 c
4  2 a
5  1 b
6  2 c
7  1 a
8  2 b
9  1 c
10 2 a
11 1 b
12 2 c
I'd like to add a column 'index' containing a group identifier, like the one produced by the following (an obviously inefficient method for large data frames):
DF$index <- with(DF, as.numeric(factor(paste0(a, b))))
> DF
   a b index
1  1 a     1
2  2 b     5
3  1 c     3
4  2 a     4
5  1 b     2
6  2 c     6
7  1 a     1
8  2 b     5
9  1 c     3
10 2 a     4
11 1 b     2
12 2 c     6
What's the fastest way to do this with very large data frames?
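For comparison, here is a minimal base R sketch of the same index built with interaction() instead of pasting the key columns row by row. It is only an illustration of an equivalent result, not a benchmarked answer to the speed question. The names idx_paste, idx_inter, ambiguous, n_paste and n_inter are made up for this example. The second half shows a side issue with the paste0() approach: without a separator, distinct pairs can collapse into the same key.

```r
# Same example data as above (sort() on letters[1:3] is a no-op, so it is omitted).
DF <- data.frame(a = rep(1:2, 6), b = letters[1:3])

# Approach from the question: paste the keys, then factor the resulting strings.
idx_paste <- with(DF, as.numeric(factor(paste0(a, b))))

# Sketch of an alternative: combine the columns as factors.
# drop = TRUE discards unused combinations; lex.order = TRUE makes the first
# column vary slowest, which matches the sorted pasted keys for this data.
idx_inter <- with(DF, as.integer(interaction(a, b, drop = TRUE, lex.order = TRUE)))

# Side issue: paste0() without a separator can merge distinct pairs.
ambiguous <- data.frame(a = c(1, 11), b = c("1x", "x"))
n_paste <- length(unique(with(ambiguous, paste0(a, b))))  # both rows become "11x"
n_inter <- nlevels(with(ambiguous, interaction(a, b, drop = TRUE)))
```

On the example data both vectors hold the same group numbers. If the group numbers only need to be distinct rather than sorted, data.table's .GRP with by = .(a, b) is another option to try; it numbers groups in order of first appearance.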