Is there a way to calculate the following specified matrix by avoiding loops? in R or Matlab

Question

I have an N-by-M matrix X, and I need to calculate an N-by-N matrix Y:

Y[i, j] = sum((X[i,] - X[j,]) ^ 2)   0 <= i,j <= N

For now, I have to use nested loops to do it with O(n²). I would like to know if there's a better way, like using matrix operations.

more generally, sum(....) can be a function, fun(x1,x 2) of which x1, x2 are M-by-1 vectors.

For the MATLAB part, these might be of intereset: [How do I create a simliarity matrix in MATLAB?](http://stackoverflow.com/a/4171845/97160), [Matlab formula optimization](http://stackoverflow.com/a/6989907/97160) — Amro, Jun 16 '12 at 20:08

Luka Rahne · Answer 1 · 2012-06-15T23:49:19.080

2

(x[i]-x[j])^2 = x[i]² - 2*x[i]*x[j] + x[j]²

and than is middle part just matrix multiplication -2*X*tran(X) (matrix) and other parts are just vetrors and you have to run this over each element

This has O(n^2.7) or whatever matrix multiplication complexity is

Pseudocode:

vec=sum(X,rows).^2
Y=X * tran(X) * -2 
for index [i,j] in Y:
   Y[i,j] =  Y[i,j] + vec[i]+vec[y]

edited Jun 15 '12 at 23:49

answered Jun 15 '12 at 23:26

Luka Rahne

10,336
3
34
56

score 2 · Answer 2 · answered Jun 15 '12 at 23:34

2

you can use expand.grid to get a data.frame of possible pairs:

X <- matrix(sample(1:5, 50, replace=TRUE), nrow=10)

row.ind <- expand.grid(1:dim(X)[1], 1:dim(X)[2])

Then apply along each pair using a function:

myfun <- function(n) {
  sum((X[row.ind[n, 1],] - X[row.ind[n, 2],])^2)
}

Y <- matrix(unlist(lapply(1:nrow(row.ind), myfun)), byrow=TRUE, nrow=nrow(X))


> Y
      [,1] [,2] [,3] [,4] [,5]
 [1,]    0   28   15   31   41
 [2,]   31   28   33   30   33
 [3,]   28    0   15    7   19
 [4,]   33   30   19   34   11
 [5,]   15   15    0   12   22
 [6,]   10   19   10   21   20
 [7,]   31    7   12    0    4
 [8,]   16   17   16   13    2
 [9,]   41   19   22    4    0
[10,]   14   11   28    9    2
>

I bet there is a better way but its Friday and I'm tired!

answered Jun 15 '12 at 23:34

Justin

42,475
9
93
111

@ralu I wans't thinking about complexity, but you can substitute whatever you'd like in the function to simplify the matrix math, but this avoids a looping construct and is fairly straight forward to multi-thread w/ `mclapply` if desired. And, like I said, its Friday! – Justin Jun 15 '12 at 23:39
Maybe that better way is to use `dist()` since we're just computing the Euclidean distance between pairs of rows. This would be the R equivalent of Ansari's "cheating" Matlab answer. – joran Jun 16 '12 at 02:17
But the OP asked for a general solution where the `sum(...)` piece could be anything. – Justin Jun 16 '12 at 02:29
yes, a general solution is what I wanted. I am coding a SVM function, the matrix is the Dmat for quadprog in R, and the the sum... piece is the kernel function. – Fivesheep Jun 16 '12 at 05:01

Ansari · Answer 3 · 2012-06-16T00:13:16.007

In MATLAB, for your specific f, you could just do this:

Y = pdist(X).^2;

For a non-"cheating" version, try something like this (MATLAB):

[N, M] = size(X);
f = @(u, v) sum((u-v).^2);
helpf = @(i, j) f(X(i, :), X(j, :))
Y = arrayfun(helpf, meshgrid(1:N, 1:N), meshgrid(1:N, 1:N)');

There are more efficient ways of doing it with the specific function sum(...) but your question said you wanted a general way for a general function f. In general this operation will be O(n^2) times the complexity of each vector pair operation because that's how many operations need to be done. If f is of a special form, some calculations' results can be reused.

Is there a way to calculate the following specified matrix by avoiding loops? in R or Matlab

3 Answers3

Linked