I have a data.table with a column foo. I'd like to get all the rows whose value in foo occurs more than once.
I thought dt[duplicated(dt$foo),]
was supposed to do that, but for each value in foo that has duplicates, it skips the first row and returns only the later rows with that value.
I'm not sure that's clear, so here is an example:
> dt <- data.table(id = c(1,2,3,4,5,6,7,8,9), foo = c("a","b","b","b","c","c","d","e","e"))
> print(dt)
   id foo
1:  1   a
2:  2   b
3:  3   b
4:  4   b
5:  5   c
6:  6   c
7:  7   d
8:  8   e
9:  9   e
> dt[duplicated(dt$foo),]
   id foo
1:  3   b
2:  4   b
3:  6   c
4:  9   e
Whereas I would like:
   id foo
2:  2   b
3:  3   b
4:  4   b
5:  5   c
6:  6   c
8:  8   e
9:  9   e
How can I get all the rows, including the first occurrence of each duplicated value?
Thanks.
EDIT: OK, I found this:
dt[foo %in% dt[duplicated(dt$foo),]$foo]
It seems to work (and makes sense), but is it the simplest way to do this?
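For comparison, here is a minimal sketch of the approach above next to two other ways this might be written. It assumes the data.table package is loaded; the result names via_in, via_both and via_count are only labels I made up for illustration. The second form relies on the fromLast argument of duplicated(), which scans from the end so that the first occurrence is flagged too. The third keeps only groups of foo with more than one row.

```r
library(data.table)

dt <- data.table(id = c(1,2,3,4,5,6,7,8,9),
                 foo = c("a","b","b","b","c","c","d","e","e"))

# Approach from the EDIT: keep rows whose foo appears among the duplicated values
via_in <- dt[foo %in% dt[duplicated(dt$foo),]$foo]

# Flag a row if it is a duplicate when scanning from the start OR from the end,
# so the first occurrence of each repeated value is kept as well
via_both <- dt[duplicated(foo) | duplicated(foo, fromLast = TRUE)]

# Group by foo and return the group's rows only when it has more than one row
# (note: the grouping column foo comes first in this result)
via_count <- dt[, if (.N > 1L) .SD, by = foo]

print(via_both)
```

On the example data all three should return the rows with id 2, 3, 4, 5, 6, 8 and 9; they differ mainly in readability and in the column order of the grouped version.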