What is the fastest calculation for bipartite distance in R with a parallelized Rcpp backend?
parallelDist is a great package with a C++ backend and support for multi-threading, but to my knowledge it does not support bipartite distance calculations.
One workaround is to call parallelDist() on the stacked matrices and keep only the m1:m2 block. This also computes the m1:m1 and m2:m2 distances, which is highly inefficient:
library(parallelDist)
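# Compute all pairwise distances of the stacked rows, then keep the matrix1-vs-matrix2 block.
bipartiteDist <- function(matrix1, matrix2) {
  n1 <- nrow(matrix1)
  n2 <- nrow(matrix2)
  d <- as.matrix(parallelDist(rbind(matrix1, matrix2)))
  # Rows index matrix1, columns index matrix2 (also correct when n1 != n2).
  d[seq_len(n1), n1 + seq_len(n2), drop = FALSE]
}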
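# Two 10 x 100000 test matrices (the 1000 random values are recycled to fill them).
matrix1 <- abs(matrix(rnorm(1000), 10, 100000))
matrix2 <- abs(matrix(rnorm(1000), 10, 100000))
# Named d12 rather than dist to avoid masking stats::dist().
d12 <- bipartiteDist(matrix1, matrix2)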
This approach is faster than pdist or a pure R implementation when more than 3 cores are available.
pdist is perfect for computing bipartite distances, but does not support multithreading.
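To be explicit about the target output, here is a minimal base R sketch that computes only the m1:m2 block, assuming Euclidean distance. The helper name crossDist is hypothetical and not part of any package. It relies on the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, so most of the work happens in tcrossprod(). Whether that call runs multi-threaded depends on the BLAS that R is linked against, so treat this as a reference for the expected result rather than a tuned solution.

```r
# Reference sketch: Euclidean distances between the rows of x and the rows of y,
# computing only the nrow(x) x nrow(y) cross block.
# crossDist is a hypothetical helper name used for illustration.
crossDist <- function(x, y) {
  stopifnot(ncol(x) == ncol(y))
  xx <- rowSums(x^2)  # squared norms of the rows of x
  yy <- rowSums(y^2)  # squared norms of the rows of y
  # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b
  d2 <- outer(xx, yy, "+") - 2 * tcrossprod(x, y)
  sqrt(pmax(d2, 0))   # clamp tiny negative values caused by rounding
}

set.seed(1)
x <- matrix(rnorm(5 * 20), 5, 20)
y <- matrix(rnorm(7 * 20), 7, 20)
d <- crossDist(x, y)  # 5 x 7 matrix: rows index x, columns index y
```

The expansion can lose some precision when two rows are nearly identical, which is why the squared distances are clamped at zero before taking the square root.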
Are there any fast implementations of parallelized bipartite distance computation?