
I am trying to vectorize my nested for loop code using apply/mapply/lapply/sapply or any other way to reduce the running time. My code is as follows:

for (i in 1:dim) {
  for (j in i:dim) {
    if (mydist.fake[i, j] != d.hat.fake[i, j]) {
      if ((mydist.fake[i, j] / d.hat.fake[i, j] > 1.5) | (d.hat.fake[i, j] / mydist.fake[i, j] > 1.5)) {
        data1 = cbind(rowNames[i], rowNames[j], mydist.fake[i, j], d.hat.fake[i, j], 1)
        colnames(data1) = NULL
        row.names(data1) = NULL
        data = rbind(data, data1)
      } else {
        data1 = cbind(rowNames[i], rowNames[j], mydist.fake[i, j], d.hat.fake[i, j], 0)
        colnames(data1) = NULL
        row.names(data1) = NULL
        data = rbind(data, data1)
      }
    }
  }
}
write.table(data, file = "fakeTest.txt", sep = "\t", col.names = FALSE, row.names = FALSE)
  • rowNames is the vector of row names of all data points.
  • data is a data frame.
  • mydist.fake and d.hat.fake are distance matrices (the diagonal is zero and the upper and lower triangles hold the same values), so I only need to traverse one triangle (the lower one), leaving out the diagonal as well.
  • Both matrices have the same dimensions.

The main problem I am facing is vectorizing the inner j loop, because j starts at i rather than at 1.

snape
  • Welcome to Stack Overflow. [Reproducible examples](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) are the way to go. What is `rowNames` in your example? – mnel Mar 15 '13 at 03:40
  • Really need sample data for questions like this. – CHP Mar 15 '13 at 04:07

1 Answer


A vectorized version of your code is:

dist1 <- mydist.fake
dist2 <- d.hat.fake

data <- data.frame(i  = rowNames[row(dist1)[lower.tri(dist1)]],
                   j  = rowNames[col(dist1)[lower.tri(dist1)]],
                   d1 = dist1[lower.tri(dist1)],
                   d2 = dist2[lower.tri(dist2)])

data <- transform(data, outcome = d1/d2 > 1.5 | d2/d1 > 1.5)
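
To see why this indexing works, here is a minimal sketch on a toy 3 x 3 symmetric matrix (the matrix `m` and the name `lt` are made up for illustration and are not part of the question). `lower.tri()` returns a logical mask of the cells strictly below the diagonal, and `row()` / `col()` return matrices of row and column indices, so subsetting them with the mask yields the index pairs in column-major order:

```r
# Toy symmetric matrix with a zero diagonal (illustrative only)
m <- matrix(c(0, 1, 2,
              1, 0, 3,
              2, 3, 0), nrow = 3, byrow = TRUE)

lt <- lower.tri(m)   # TRUE strictly below the diagonal

row(m)[lt]           # row index of each selected cell:    2 3 3
col(m)[lt]           # column index of each selected cell: 1 1 2
m[lt]                # the selected values themselves:     1 2 3
```

Each unordered pair of points therefore appears exactly once, and the diagonal is never visited.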

I tested it successfully using the following sample data:

X           <- matrix(runif(200), 20, 10)
Y           <- matrix(runif(200), 20, 10)
rowNames    <- paste0("var", seq_len(nrow(X)))
mydist.fake <- as.matrix(dist(X))
d.hat.fake  <- as.matrix(dist(Y))
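
The loop in the question differs from the data frame above in three small ways: it skips pairs whose two distances are exactly equal, it records the flag as 1/0 rather than TRUE/FALSE, and it visits pairs with i < j. The following is a rough sketch of how one might check the vectorized approach against that loop on the sample data, with those adjustments applied. The seed, the names `lt`, `vec`, `ref`, `same`, and the column names `first` / `second` are assumptions made for this illustration, not part of the original answer:

```r
set.seed(42)  # arbitrary seed, only to make the sketch repeatable
X        <- matrix(runif(200), 20, 10)
Y        <- matrix(runif(200), 20, 10)
rowNames <- paste0("var", seq_len(nrow(X)))
dist1    <- as.matrix(dist(X))
dist2    <- as.matrix(dist(Y))

# Vectorized version, adjusted to mimic the loop's output.
lt  <- lower.tri(dist1)
vec <- data.frame(first  = rowNames[col(dist1)[lt]],  # smaller index first, as in the loop
                  second = rowNames[row(dist1)[lt]],
                  d1     = dist1[lt],
                  d2     = dist2[lt],
                  stringsAsFactors = FALSE)
vec <- vec[vec$d1 != vec$d2, ]                        # the loop skips equal distances
vec$outcome <- as.integer(vec$d1 / vec$d2 > 1.5 | vec$d2 / vec$d1 > 1.5)

# Reference: the nested loop, collecting rows in a list instead of
# calling rbind() on a growing data frame at every step.
n   <- nrow(dist1)
ref <- list()
for (i in 1:n) {
  for (j in i:n) {
    a <- dist1[i, j]
    b <- dist2[i, j]
    if (a != b) {
      ref[[length(ref) + 1]] <- data.frame(
        first = rowNames[i], second = rowNames[j],
        d1 = a, d2 = b,
        outcome = as.integer(a / b > 1.5 | b / a > 1.5),
        stringsAsFactors = FALSE)
    }
  }
}
ref <- do.call(rbind, ref)

# Same rows in the same order (this relies on both matrices being symmetric).
same <- isTRUE(all.equal(vec, ref, check.attributes = FALSE))
print(same)
```

If a tab-separated file is needed, as in the question, `write.table(vec, file = "fakeTest.txt", sep = "\t", col.names = FALSE, row.names = FALSE)` can be applied to the result. Note that a pair where one distance is zero and the other is not gives a ratio of `Inf`, which both versions flag as 1.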
flodel
  • Thanks a lot, that was quick and awesome. At least I learned something today. Thanks again! – snape Mar 15 '13 at 04:40