Index and count unique combination of variables using R, but do NOT remove duplicates

Question

Take this data frame for example:

DT <- data.table(A = rep(1:3, each=4), 
                 B = rep(c(NA,1,2,4), each=3), 
                 C = rep(1:2, 6))

I want to append a column that assign index to unique combinations of A and B, but ignore C. I also want another column that count the number of duplicates, that looks like this:

    A  B C Index Count
 1: 1 NA 1     1     3
 2: 1 NA 2     1     3
 3: 1 NA 1     1     3
 4: 1  1 2     2     1
 5: 2  1 1     3     2
 6: 2  1 2     3     2
 7: 2  2 1     4     2
 8: 2  2 2     4     2
 9: 3  2 1     5     1
10: 3  4 2     6     3
11: 3  4 1     6     3
12: 3  4 2     6     3

I don't want to trim the data frame and (preferably)I don't want to reorder the rows. I tried setDT, such as

setDT(DT)[,.(.I, .N), by = names(DT[,1:2])]

But the I column is not the index I want, and Column C is gone. Thanks in advance!

check this out: https://stackoverflow.com/questions/17421776/how-to-add-count-of-unique-values-by-group-to-r-data-frame — Gaurav Taneja, Sep 22 '17 at 02:07
`DT[, g := rleid(A,B)][, n := .N, by=g][]` I guess there is a dupe somewhere. — Frank, Sep 22 '17 at 02:18
@Frank, your solution worked for the example I gave in the post, but it only works if the data is aligned in this way, if for example, replace the position between B[2] and B[4], your solution will give a wrong result. To make a reproducible new data frame, please try: DT <- data.table(A = rep(1:3, each=4), B = rep(c(NA,1,2,NA,1,2,1,3,2,1,1,3)), C = rep(1:2, 6)) — Chuck C, Sep 22 '17 at 02:35
Ok, if your data is not sorted, there's `DT[, \`:=\`(g = .GRP, n = .N), by=.(A,B)]` I think you'll see similar stuff in the intro vignettes for data.table. — Frank, Sep 22 '17 at 02:38
@Frank - you can also do that a little less esoterically with `DT[, c("g","n") := .(.GRP, .N), by=.(A,B)]` — thelatemail, Sep 22 '17 at 03:02

Index and count unique combination of variables using R, but do NOT remove duplicates

0 Answers0