3

Hi I have a function in R that I'm trying to optimize for performance. I need to vectorize a for loop. My problem is the slightly convoluted data structure and the way I need to perform lookups using the 'which' command.

Lets say we are dealing with 5 elements (1,2,3,4,5), the 10x2 matrix pairs is a combination of all unique pairs the 5 elements (i.e. (1,2), (1,3),(1,4) ....(4,5)). all_prods is a 10x1 matrix that I need to look up using the pairs while iterating through all the 5 elements.

So for 1, I need to index rows 1, 2, 3, 4 (pairs 1,2 1,3 1,4 and 1,5) from all_prods and so on for 1, 2, 3, 4, 5.

I have only recently switched to R from matlab so would really appreciate any help.

foo <- function(AA , BB , CC ){
    pa <- AA*CC;
    pairs <-  t(combn(seq_len(length(AA)),2));

    all_prods <- pa[pairs[,1]] * pa[pairs[,2]];

    result <- matrix(0,1,length(AA));

    # WANT TO VECTORIZE THIS BLOCK
    for(st in seq(from=1,to=length(AA))){
       result[st] <- sum(all_prods[c(which(pairs[,1]==st), which(pairs[,2]==st))])*BB[st];
    }
   return(result);
}
AA <- seq(from=1,to=5); BB<-seq(from=11,to=15); CC<-seq(from=21,to=25);
results <- foo(AA,BB,CC);

#final results is [7715 164208 256542 348096 431250]

I want to convert the for loop into a vectorised version. Instead of looping through every element st, I'd like to do it in one command that gives me a results vector (rather than building it up element by element)

10 Rep
  • 2,217
  • 7
  • 19
  • 33
user1480926
  • 634
  • 5
  • 12
  • 1
    I suggest you provide some sample data to play with. See http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Roman Luštrik Jun 26 '12 at 09:36
  • I'm not sure that's your bottle neck. You pre-allocate a matrix which is good, but `combn` and `t` are pretty greedy. Have you tried profiling your code? – Roman Luštrik Jun 26 '12 at 09:56
  • I want to convert the for loop into a vectorised version. Instead of looping through every element st, I'd like to do it in one command that gives me a results vector (rather than building it up one by one). My apologies for not making this clearer. I have updated the post. – user1480926 Jun 26 '12 at 09:57
  • I tried an `sapply` solution but it's pretty comparable to what you achieved (kudos to pre-allocating your object). After profiling it would seem the majority of the time is spent in `which`. `Rprof("vekt.txt"); results <- foo(AA,BB,CC); Rprof(); summaryRprof("vekt.txt") ` – Roman Luštrik Jun 26 '12 at 10:10
  • Thank you once again Roman. This is why I want to either vectorise it or rewrite it in another way (my understanding is that the apply family is an elegant way to write a loop, but not a performance improvement). Also a further thank you for introducing me to RProf :) – user1480926 Jun 26 '12 at 10:12
  • Does this answer your question? [Vectorizing which operation across the rows of a matrix](https://stackoverflow.com/questions/60097628/vectorizing-which-operation-across-the-rows-of-a-matrix) – EJoshuaS - Stand with Ukraine Jan 20 '21 at 15:36
  • To potential close vote reviewers: Note, this post entered the queue as part of a burnination of the [which] tag. I disagree that the proposed target is a reasonable duplicate. At the very least a second duplicate target would be required. I have voted to Leave Open. – Ian Campbell Jan 22 '21 at 19:46

1 Answers1

8

You could write your function like this:

foo <- function(AA, BB, CC) {
  pa <- AA*CC
  x <- outer(pa, pa)
  diag(x) <- 0
  res <- colSums(x)*BB
  return(res)
}

The key idea is to not break the symmetry. Your use of ordered pairs corresponds to the upper right triangle of my matrix x. Although this seems like just half as many values to compute, the syntactic and computational overhead becomes quite large. You are distinguishing situations where st is the first element in the pair from those where it is the second. Later on this leads to quite some trouble to get rid of that distinction. Having the full symmetric matrix, you don't have to worry about order, and things vectorize smoothly.

MvG
  • 57,380
  • 22
  • 148
  • 276