Yup. With the mighty power of bsxfun
and permute
, with a side of sum
, and a dash of reshape
. This would be the first part, where you calculate the pair-wise distances between a point in S1
and another point in S2
:
out = reshape(sqrt(sum(bsxfun(@minus, S1, permute(S2, [3 2 1])).^2, 2)), size(S1,1), size(S2,1));
The last thing you need now is to determine the closest vector in S2
to each of S1
. That can be done using min
:
[dist,ind] = min(out, [], 2);
dist
would contain the smallest distance between a point in S1
with a point in S2
, and ind
would tell you which point that was.
This code looks very intimidating, but let's break it up into pieces.
permute(S2, [3 2 1])
: This takes the matrix S2
, which is a M x D
matrix and shuffles the dimensions so that it becomes a 1 x D x M
matrix.... now why would we want to do that? Let's move onto the next part and it'll make more sense.
bsxfun(@minus, S1, ...)
: bsxfun
stands for Binary Singleton EXpansion FUNction. What bsxfun
does is that if you have two inputs where either or both inputs has a singleton dimension, or if either of both inputs has only one dimension which has value of 1, each input is replicated in their singleton dimensions to match the size of the other input, and then an element-wise operation is applied to these inputs together to produce your output. In this case, I want to subtract these two newly formed inputs together.
As such, given that S1
is N x D
... or technically, this is N x D x 1
, and given that S2
is M x D
, which I permuted so that it becomes 1 x D x M
, we will create a new matrix that is N x D x M
long. The first input will duplicate itself as a 3D matrix where each slice is equal to S1
and that is N x D
. S2
is now a 3D matrix, but it is represented in such a way where each row in the original matrix is a slice in the 3D matrix where each slice consists of just a single row. This gets duplicated for N
rows.
We now apply the @minus
operation, and the effect of this is that for each output slice i
in this new matrix, this gives you the component wise difference between the point i
in S2
with all of the other points in S1
. For example, for slice #1, row #1 gives you the component wise differences between point #1 in S2
and point #1 in S1
. Row #2 gives you the component wise differences between point #1 in S2
and point #2 S1
, and so on.
sum((...).^2, 2)
: We want to find the Euclidean distance between one point and another point, so we sum these distances squared over each column independently. This results in a new 3D matrix where each slice contains N
values where there are N
distances for each of the M
points. For example, the first slice will give you the distances from point #1 in S2
with all of the other points in S1
.
out = reshape(..., size(S1,1), size(S2,1));
: We now reshape this so that it becomes a M x N
matrix so that each row and column pair of (i,j)
gives you the distances between points i
in S1
and j
in S2
, thus completing the calculations.
Doing [dist,ind] = min(out, [], 2);
determines the smallest distance between a point in S1
with the other points in S2
. dist
will give you the smallest distances while ind
will tell you which vector it is. Therefore, for each element in dist
, it gives you the smallest distance between a point i
in S1
and one of S2
, and ind
tells you which vector that was that belonged to S2
.
We can verify that this gives us correct results by using your proposed approach of looping through each pair of points and calculating the norm. Let's create S1
and S2
:
S1 = [1 2 3; 4 5 6; 7 8 9; 10 11 12];
S2 = [-1 -1 -1; 0 9 8];
More neatly displayed:
>> S1
S1 =
1 2 3
4 5 6
7 8 9
10 11 12
>> S2
S2 =
-1 -1 -1
0 9 8
Using the loop approach, we have this code:
out = zeros(size(S1,1), size(S2,1));
for ii = 1 : size(S1,1)
for jj = 1 :size(S2,1)
out(ii,jj) = norm(S1(ii,:) - S2(jj,:));
end
end
We get this matrix:
>> out
out =
5.3852 8.6603
10.4881 6.0000
15.6525 7.1414
20.8327 10.9545
Similarly, if we ran the code I wrote, we also get:
>> out = reshape(sqrt(sum(bsxfun(@minus, S1, permute(S2, [3 2 1])).^2, 2)), size(S1,1), size(S2,1))
out =
5.3852 8.6603
10.4881 6.0000
15.6525 7.1414
20.8327 10.9545
To complete the process, let's find the smallest distances and the corresponding vectors:
>> [dist,ind] = min(out, [], 2);
>> dist
dist =
5.3852
6.0000
7.1414
10.9545
>> ind
ind =
1
2
2
2
Therefore, for the first vector in S1
, the closest vector to this in S2
was first one, with a distance of 5.3852. Similarly, the second vector of S1
, the closest vector in S2
was the second one, with a distance of 6. You can repeat this for the other values and see that it's correct.