How do I detect duplicates of one specific column in R? I know the duplicated() function, but it reports every duplicated column, while I only want to know which columns duplicate one particular column. Example:

> x = 1:5
> y=6:10
> z=11:15
> mat=cbind(x,y,x,x,y,z)
> mat
     x  y x x  y  z
[1,] 1  6 1 1  6 11
[2,] 2  7 2 2  7 12
[3,] 3  8 3 3  8 13
[4,] 4  9 4 4  9 14
[5,] 5 10 5 5 10 15

Now checking for duplicates:

 > which(duplicated(mat, MARGIN=2))
 [1] 3 4 5

So columns 3, 4 and 5 are indeed duplicated in the matrix, but I would like to be able to query a specific column. For example:

somehow_specific_duplicated(mat[,1], mat) 
[1] 3 4

Does anyone know an easy way to achieve that?

Thanks!

Ruslan

1 Answer

You could try

unname(which(!colSums(mat[,1]!=mat))[-1])
#[1] 3 4

For the second column:

 unname(which(!colSums(mat[,2]!=mat))[-1])
 #[1] 5
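
The one-liners above work because mat[,1] is recycled down every column of mat, so colSums() counts the mismatches per column and a count of zero marks an identical column. Note that [-1] drops the first match, which is the queried column only when no identical column precedes it. A minimal sketch of a reusable version follows, assuming the matrix has no NA values; the helper name dup_of_column is hypothetical and simply removes the queried index explicitly:

```r
# Hypothetical helper: indices of the columns of `m` identical to column `j`,
# excluding `j` itself. Assumes `m` contains no NA values.
dup_of_column <- function(m, j) {
  # m[, j] is recycled down each column; colSums() counts mismatches per column
  same <- colSums(m[, j] != m) == 0
  # drop the queried column explicitly instead of relying on [-1]
  setdiff(unname(which(same)), j)
}

x <- 1:5; y <- 6:10; z <- 11:15
mat <- cbind(x, y, x, x, y, z)

dup_of_column(mat, 1)  # 3 4
dup_of_column(mat, 3)  # 1 4
dup_of_column(mat, 6)  # integer(0)
```

Using setdiff() rather than [-1] means that querying column 3 returns columns 1 and 4 instead of 3 and 4.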
akrun