R: detecting duplicates of specific columns

Question

How do I detect duplicates of one specific column in R? I know the duplicated() function, but it flags every duplicated column, whereas I only want to know which columns duplicate one particular column. Example:

> x = 1:5
> y=6:10
> z=11:15
> mat=cbind(x,y,x,x,y,z)
> mat
     x  y x x  y  z
[1,] 1  6 1 1  6 11
[2,] 2  7 2 2  7 12
[3,] 3  8 3 3  8 13
[4,] 4  9 4 4  9 14
[5,] 5 10 5 5 10 15

Now checking for duplicates:

 > which(duplicated(mat, MARGIN=2))
 [1] 3 4 5

So columns 3, 4 and 5 are indeed duplicates of earlier columns in the matrix, but I would like to be able to query a specific column. For example:

somehow_specific_duplicated(mat[,1], mat) 
[1] 3 4

Does anyone know an easy way to achieve that?

Thanks!

akrun · Accepted Answer · 2014-12-14T07:06:02.500

You could try

unname(which(!colSums(mat[,1]!=mat))[-1])
#[1] 3 4
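
To see why this one-liner works, it may help to split it into steps. The sketch below rebuilds `mat` from the question and uses intermediate names (`neq`, `n_diff`, `same`, `idx`) that are only illustrative and not part of the original answer.

```r
x <- 1:5; y <- 6:10; z <- 11:15
mat <- cbind(x, y, x, x, y, z)

# mat[, 1] is recycled down each column, so every column is compared
# element-wise against column 1; the result is a 5 x 6 logical matrix
neq <- mat[, 1] != mat

# number of cells in each column that differ from column 1
n_diff <- colSums(neq)

# negating a numeric vector turns 0 into TRUE and anything else into FALSE,
# so this marks the columns with no differences at all
same <- !n_diff

# indices of the matching columns; column 1 matches itself and comes first
idx <- which(same)

# drop that first match and the column names
unname(idx[-1])
#[1] 3 4
```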

For the second column

 unname(which(!colSums(mat[,2]!=mat))[-1])
 #[1] 5
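
One caveat: `[-1]` removes the first matching column, which is the queried column only when that column is the leftmost copy. Querying column 3 this way would return 3 and 4, including column 3 itself. A possible variant that excludes the queried column explicitly is sketched below; `dup_cols_of` is a hypothetical helper name, not something from base R or the original answer, and the sketch assumes the matrix contains no NA values.

```r
# Hypothetical helper: indices of all columns of `mat` that are identical
# to column `j`, excluding `j` itself. Assumes `mat` has no NA values.
dup_cols_of <- function(mat, j) {
  # mat[, j] is recycled down each column, so every column is compared
  # element-wise against column j
  mismatches <- colSums(mat != mat[, j])
  # columns with zero mismatches are copies of column j; remove j itself
  setdiff(unname(which(mismatches == 0)), j)
}

x <- 1:5; y <- 6:10; z <- 11:15
mat <- cbind(x, y, x, x, y, z)

dup_cols_of(mat, 1)
#[1] 3 4
dup_cols_of(mat, 3)
#[1] 1 4
dup_cols_of(mat, 6)
#integer(0)
```

Using `setdiff()` instead of `[-1]` makes the result independent of where the queried column sits among its copies.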

edited Dec 14 '14 at 07:06

answered Dec 14 '14 at 07:00

akrun

