Fast way to compute pairwise distances and the nearest neighbours in Matlab?

Question

I have a n by d matrix A, representing n d-dimensional points. I have another m by d matrix B, representing m d-dimensional points.

I wonder how to efficiently computer a m by n matrix, whose (i,j) element represents the Euclidean distance between the i-th row of matrix A and the j-th row of matrix B?
How shall I efficiently determine a vector of m elements, whose k-th element represents the row of A closest to the k-th row of B?

Note I know how to do the above two using loops. But in Matlab, it is not efficient to use loops, so I ask these questions.

Thanks!

You could be interested in my contribution http://stackoverflow.com/questions/23911670/efficiently-compute-pairwise-squared-euclidean-distance-in-matlab/23911671?noredirect=1 — matheburg, Jul 25 '14 at 10:58

score 2 · Answer 1 · answered Feb 29 '12 at 02:53

2

If you have enough RAM, an efficient way could be

[idxA,idxB] = ndgrid(1:n,1:m);

distMat = zeros(n,m);

distMat(:) = sqrt( sum((A(idxA,:) - B(idxB,:)).^2,2) );

You should definitely profile the two solutions, since the loop may be sufficiently optimized that the ndgrid solution is slower.

To find the row in A closest to the points in B, you can use min. Note that this will only give you one minimum distance for each point; if you need to identify ties, you have to use find.

[minDist,closestRowsInA] = min(distMat,[],1);

answered Feb 29 '12 at 02:53

Jonas

74,690
10
137
177

Thanks! +1. That indeed takes a lot of memory. My case is n=m=2000, and d=6. My laptop became frozen for a while. – Feb 29 '12 at 03:41
@Ethan: I, too, need to buy a new laptop. 8GB of RAM gets filled so quickly :) – Jonas Feb 29 '12 at 04:46
Do you have some alternative ways that balance the tradeoff of time and memory? – Feb 29 '12 at 12:33
@Ethan: In principle, you could make a loop over the number of rows in `idxA` where you take steps of, say, 100, and thus fill distMat in chunks. However, I don't have any intuition about how much that would increase speed. – Jonas Feb 29 '12 at 12:51
You should run possible solutions and compare the speed. I asked a silly question long ago, and many of us were surprised to find that the fastest method (for a subset of matrix sizes) had a loop. http://www.mathworks.com/matlabcentral/newsreader/view_thread/249428 – Prashant Kumar Mar 01 '12 at 06:22

Fast way to compute pairwise distances and the nearest neighbours in Matlab?

1 Answers1

Linked