How to replace "for" loops with an efficient algorithm in R when calculating with unique pairs

Question

I am looking for a more efficient way to perform the computation below:

 for(j in 1:nrow) { #begin loop over j from 1 to nrow
   xJ=vectorX1[j] #some random vector
   yJ=vectorY1[j] #some random vector

   for(ij in j:nrow) { #begin loop over ij from j to nrow
     xIJ=vectorX1[ij] #some random vector
     yIJ=vectorY1[ij] #some random vector

     if(j != ij) { #only perform on unique pairs            
        XX=myfun(xJ, yJ, xIJ, yIJ)
     }
   }
 }

My vectors are pretty long, so the for loops are time sinks. Any help would be greatly appreciated.

You might look into using `apply`, as it is much faster than a for-loop in R. This post might help: http://stackoverflow.com/a/7141669/2146843 — flyingfinger, Jul 08 '15 at 19:27

score 0 · Answer 1 · answered Jul 08 '15 at 20:06

For pairwise operations, you might find the dist() function in the proxy package useful. You can provide your own function to replace the usual distance function. Using the simple example below, with nrow = 1000, the dist() approach took less than half the time of your for() loop approach.

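nrow <- 1000
vectorX1 <- 1:nrow
vectorY1 <- nrow+(1:nrow)
# one row per observation: column 1 is x, column 2 is y
m <- cbind(vectorX1, vectorY1)
# example pairwise function; each argument is one row of m
myfun <- function(row1, row2) row1[1]*row1[2] + row2[1]*row2[2]

library(proxy)
# returns a dist object holding the value of myfun for each pair of rows
dist(m, method=myfun)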
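
If myfun can be written so that it is vectorized over its arguments, a base-R alternative may avoid the loops entirely: build the index pairs once with combn() and make a single call. The sketch below is only an illustration under that assumption. The four-argument myfun is a stand-in with the same arithmetic as the example above, not the asker's actual function, and n is kept small so the result is easy to inspect.

```r
# Small inputs that mirror the names used in the question (values are illustrative).
n <- 5
vectorX1 <- 1:n
vectorY1 <- n + (1:n)

# Stand-in pairwise function (an assumption); it must be vectorized over its arguments.
myfun <- function(xJ, yJ, xIJ, yIJ) xJ * yJ + xIJ * yIJ

# Each column of `pairs` is one unique pair (j, ij) with j < ij.
pairs <- combn(n, 2)
j  <- pairs[1, ]
ij <- pairs[2, ]

# A single vectorized call replaces both loops; one value per unique pair.
result <- myfun(vectorX1[j], vectorY1[j], vectorX1[ij], vectorY1[ij])

# Reference: the double loop from the question, storing every value.
loop_result <- numeric(0)
for (a in 1:(n - 1)) {
  for (b in (a + 1):n) {
    loop_result <- c(loop_result,
                     myfun(vectorX1[a], vectorY1[a], vectorX1[b], vectorY1[b]))
  }
}
```

Here result and loop_result hold the same n*(n-1)/2 values in the same order. The number of pairs grows quadratically with the vector length, so for very long vectors memory for the index vectors can become the limiting factor.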
