I have a matrix with x columns and y rows. x and y are large values.

         A  B  C
GeneA    1  0  0
GeneB    0  0  1
GeneC    0  0  1
GeneD    1  0  1

I want to apply a distance function to every pair of rows of the matrix and output a y*y matrix in which each cell holds the function's return value for that pair. The output should look like this:

         GeneA  GeneB  GeneC  GeneD
GeneA      1.0    0.7    0.0    0.4
GeneB      0.6    0.2    1.0    1.0
GeneC      0.0    0.1    1.0    0.5
GeneD      1.0    0.5    0.1    0.8

My function is:

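# Note: naming this function dist masks stats::dist in the global environment.
dist <- function(vec1, vec2) {
  # Size of the intersection divided by the size of the union (Jaccard index)
  res <- length(intersect(vec1, vec2)) / length(union(vec1, vec2))
  return(res)
}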
  • Please provide a minimal reproducible example: [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – PaulS Jun 26 '22 at 18:48
  • Why is your second matrix not symmetric? – PaulS Jun 26 '22 at 19:16

1 Answer

You can try outer() and define the distance function however you need. For example, with Euclidean distance between rows:

# x and y are row indices; Vectorize lets outer() call the function pair by pair
outer(
  seq_len(nrow(mat)),
  seq_len(nrow(mat)),
  Vectorize(function(x, y) sqrt(sum((mat[x, ] - mat[y, ])^2)))
)

or, much more simply, use the built-in dist, which computes Euclidean distance between rows by default (thanks to @akrun for the reminder):

as.matrix(dist(mat))
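
If what you want is the intersection-over-union measure from the question, the same outer() pattern should work with that function swapped in. The following is only a sketch under some assumptions: mat is rebuilt from the example in the question, jaccard_sim is a hypothetical helper name, and each row is treated as the set of columns holding a non-zero value (the question does not say how the sets are meant to be formed).

```r
# Example matrix from the question (rows = genes, columns = A, B, C)
mat <- matrix(
  c(1, 0, 0,
    0, 0, 1,
    0, 0, 1,
    1, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"), c("A", "B", "C"))
)

# Hypothetical helper: Jaccard similarity between rows i and j, treating each
# row as the set of column names with a non-zero entry (an assumption).
jaccard_sim <- function(i, j) {
  a <- colnames(mat)[mat[i, ] != 0]
  b <- colnames(mat)[mat[j, ] != 0]
  u <- union(a, b)
  if (length(u) == 0) return(NA_real_)  # both rows all zero: undefined
  length(intersect(a, b)) / length(u)
}

# Apply the helper to every pair of row indices
idx <- seq_len(nrow(mat))
sim <- outer(idx, idx, Vectorize(jaccard_sim))
dimnames(sim) <- list(rownames(mat), rownames(mat))
sim

# For 0/1 data, the built-in "binary" method gives the matching distance
# (1 - similarity). stats:: is spelled out in case dist has been masked.
sim_builtin <- 1 - as.matrix(stats::dist(mat, method = "binary"))
```

For large y, the outer() version makes y*y separate R function calls, so the built-in method is likely the faster of the two on binary data.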
ThomasIsCoding