0

I have tried to write a function with a double loop, because I have to simulate a double sum. This code My code works but it is slow, how can I speed it up?

a & y are vectors, x is a matrix. They are all the same length, namely 100. X is 100x4 (100 rows, 4 col)

X1 <- matrix(rnorm(4*100), ncol=4)
y1 <- sign(X1[,1] + X1[,2] > 0)*2 - 1

fn <- function (a,x,y){
  dsum <-0
  for(i in 1:length(y)){
    for(j in 1:length(y)){
      dsum <- dsum + a[j]*a[i]*y[j]*y[i]*(t(x)[,j])%*%x[i,]
    }
  }
  res <- sum(a)-.5*dsum
  return (res)
}

I tried to do sum(a)-.5%*%sum(a%o%X%o%y*a%o%t(X)%o%y) But I am most certainly wrong.

Kevin
  • 3,077
  • 6
  • 31
  • 77
  • Care to give more details? Do `a` and `y` share the same length? What's the dimension of `x`? Can you explain which sum are you seeking? – nicola Mar 04 '16 at 07:38
  • They are all the same length, namely 100. X is 100x4 (100 rows, 4 col) – Kevin Mar 04 '16 at 07:40
  • 2
    [How to make a great R reproducible example?](http://stackoverflow.com/questions/5963269) – zx8754 Mar 04 '16 at 07:40
  • Please add any further details into your post, not in the comments. – zx8754 Mar 04 '16 at 07:42
  • the elements in `t(x)[,j]` are the same as in `t[j,]` – jogo Mar 04 '16 at 07:45
  • Possible duplicate of http://stackoverflow.com/questions/8691966/vectorize-speed-up-code-with-nested-for-loops?rq=1 and http://stackoverflow.com/questions/29054459/how-to-speed-up-or-vectorize-a-for-loop?rq=1 and http://stackoverflow.com/questions/8701259/use-of-xapply-to-speed-up-loops?rq=1 – symbolrush Mar 04 '16 at 07:48

2 Answers2

5

Try this:

fn2<-function(a,x,y) {
   one<-outer(a*y,a*y)
   two<-rowSums(x[rep(1:nrow(x),nrow(x)),]*x[rep(1:nrow(x),each=nrow(x)),])
   sum(a)-.5*sum(one*two)
}

An example:

set.seed(1)
a<-runif(100)
y<-runif(100)
x<-matrix(runif(400),nrow=100)
fn(a,x,y)
#[1,] -303.6947
fn2(a,x,y)
#[1] -303.6947
nicola
  • 24,005
  • 3
  • 35
  • 56
2
dsum <- sum(tcrossprod(x*(a*y), x*(a*y)))

gives the same result as the calculation in the loops. The short version is:

dsum <- sum(tcrossprod(x*(a*y)))

I suppose that this solution is fast. If you want to put it in your function you can do:

fn <- function (a,x,y) sum(a) - 0.5*sum(tcrossprod(x*(a*y)))

Example:

x <- matrix(1, 6, 2)
a <- 1:6
y <- 11:16

dsum <- 0
for(i in 1:length(y)) {
  for(j in 1:length(y)) {
    dsum <- dsum + a[j]*a[i]*y[j]*y[i]*(t(x)[,j])%*%x[i,]
  }
}
dsum

sum(tcrossprod(x*(a*y), x*(a*y)))
sum(tcrossprod(x*(a*y)))
jogo
  • 12,469
  • 11
  • 37
  • 42
  • 1
    With MRO or any distribution linked to a good BLAS, this _flies_. Almost two orders of magnitude faster than the original, and one faster than the other answer, according to `microbenchmark` on my laptop. – alistaire Mar 04 '16 at 08:18
  • I "rewrote" in a sense the matrix product in the second line of my function and realized later that a simple product would have been enough. OP should accept this answer and not mine. +1 of course. – nicola Mar 04 '16 at 09:34
  • @nicola Thanks for your comment. I need some minutes more to analyse the algebra. – jogo Mar 05 '16 at 17:22