1

I have a n by d matrix A, representing n d-dimensional points. I have another m by d matrix B, representing m d-dimensional points.

  1. I wonder how to efficiently computer a m by n matrix, whose (i,j) element represents the Euclidean distance between the i-th row of matrix A and the j-th row of matrix B?
  2. How shall I efficiently determine a vector of m elements, whose k-th element represents the row of A closest to the k-th row of B?

Note I know how to do the above two using loops. But in Matlab, it is not efficient to use loops, so I ask these questions.

Thanks!

  • You could be interested in my contribution http://stackoverflow.com/questions/23911670/efficiently-compute-pairwise-squared-euclidean-distance-in-matlab/23911671?noredirect=1 – matheburg Jul 25 '14 at 10:58

1 Answers1

2

If you have enough RAM, an efficient way could be

[idxA,idxB] = ndgrid(1:n,1:m);

distMat = zeros(n,m);

distMat(:) = sqrt( sum((A(idxA,:) - B(idxB,:)).^2,2) );

You should definitely profile the two solutions, since the loop may be sufficiently optimized that the ndgrid solution is slower.

To find the row in A closest to the points in B, you can use min. Note that this will only give you one minimum distance for each point; if you need to identify ties, you have to use find.

[minDist,closestRowsInA] = min(distMat,[],1); 
Jonas
  • 74,690
  • 10
  • 137
  • 177
  • Thanks! +1. That indeed takes a lot of memory. My case is n=m=2000, and d=6. My laptop became frozen for a while. –  Feb 29 '12 at 03:41
  • @Ethan: I, too, need to buy a new laptop. 8GB of RAM gets filled so quickly :) – Jonas Feb 29 '12 at 04:46
  • Do you have some alternative ways that balance the tradeoff of time and memory? –  Feb 29 '12 at 12:33
  • @Ethan: In principle, you could make a loop over the number of rows in `idxA` where you take steps of, say, 100, and thus fill distMat in chunks. However, I don't have any intuition about how much that would increase speed. – Jonas Feb 29 '12 at 12:51
  • You should run possible solutions and compare the speed. I asked a silly question long ago, and many of us were surprised to find that the fastest method (for a subset of matrix sizes) had a loop. http://www.mathworks.com/matlabcentral/newsreader/view_thread/249428 – Prashant Kumar Mar 01 '12 at 06:22