How to find all nearest values within a vector in MATLAB?

Question

I have a vector, e.g. A=[2.30 2.32 2.67 2.44 2.31 1.23], and I want to find all the values in it that are closest (almost equal) to each other. For the example above, the answer should be indices 1, 2 and 5.

I don't know how to prescribe the tolerance, but the resulting values should be almost equal to each other. Can anybody provide a hint?

score 1 · Answer 1 · edited May 23 '17 at 11:44

I suggest the following approach:

% initialize A
A = [2.30 2.32 2.67 2.44 2.31 1.23];
n = numel(A);   % number of elements; needed below to build the pairs

% initialize an epsilon parameter that defines how close two values must be to be considered identical
EPSILON = 0.05;

% generate all possible pairs of indices into A
[p,q] = meshgrid(1:n);
mask = logical(tril(ones(n,n))-eye(n,n));
allPairs = [p(mask),q(mask)];

% find pairs with absolute difference below EPSILON
validPairs = abs(A(allPairs(:,1)) - A(allPairs(:,2))) < EPSILON;

% result: pairs of indices whose values are close to one another
allPairs(validPairs,:)

Result: for the A above, the index pairs (1,2), (1,5) and (2,5).

*The code for generating all possible pairs is taken from @Lambdageek's solution.
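
To go from a list of close pairs to the indices the question asks for, one option is to collect every index that appears in at least one close pair. The following is only a minimal sketch and not part of the original answer: it builds the pairs with nchoosek instead of the meshgrid mask, and the names pairs, isClose and closeIdx are illustrative.

```matlab
A = [2.30 2.32 2.67 2.44 2.31 1.23];
EPSILON = 0.05;   % assumed tolerance, same value as in the answer

% every unordered pair of indices (i < j), one pair per row
pairs = nchoosek(1:numel(A), 2);

% true for pairs whose values differ by less than EPSILON
isClose = abs(A(pairs(:,1)) - A(pairs(:,2))) < EPSILON;

% indices that take part in at least one close pair, as a row vector
closeIdx = unique(pairs(isClose, :)).'
```

For the example vector this gives the indices 1, 2 and 5. Note that the tolerance still has to be chosen by hand.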

I cannot guess the tolerance (EPSILON) in advance; it may be different for another example. - erbal, Jul 01 '16 at 18:47

Bastian · Answer 2 · 2016-06-27T07:30:51.727

If you want to express distance in mathematical terms, you can use the Euclidean distance. For two scalar values x and y the expression is:

d(x, y) = sqrt((x - y)^2) = |x - y|

If you have a higher-dimensional space (which you have), you can get some information from Wikipedia. But it's still straightforward:

https://en.wikipedia.org/wiki/Euclidean_distance#n_dimensions

Since the Euclidean Distance is not the best distance measure in higher dimensional spaces, some people suggest the Cosine Similarity:

https://en.wikipedia.org/wiki/Cosine_similarity
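
As a rough illustration of the two measures mentioned above, both can be computed in a line each with built-in functions. The vectors x and y are hypothetical 3-dimensional points chosen only for this sketch; they are not from the question.

```matlab
% two hypothetical 3-dimensional points, chosen only for illustration
x = [1 2 3];
y = [2 4 6];

% Euclidean distance: square root of the summed squared differences
euclidDist = norm(x - y);                    % same as sqrt(sum((x - y).^2))

% cosine similarity: cosine of the angle between x and y
cosSim = dot(x, y) / (norm(x) * norm(y));
```

Here y is a multiple of x, so the cosine similarity is 1 (up to rounding) even though the Euclidean distance is sqrt(14). This shows that the two measures answer different questions: one compares positions, the other compares directions.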

You could also use an algorithm such as k-means or k-nearest-neighbors to solve this task.

If you are just looking for the most similar values in the vector:

1. Define a threshold, say 0.01.
2. Select the first element of the vector (xi, where i=0).
3. Select the next element after xi (xj, where j=i+1).
4. Compare xi with xj by, for example, dist = sqrt((xi - xj)^2). If dist is smaller than or equal to your threshold, xi and xj are very similar.
5. Increment j and compare again.
6. When xj has reached the end of your vector, increment i and restart j at i+1.
7. Do this until you have compared all elements.
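
The steps above could be sketched in MATLAB roughly as follows. This is an illustration under some assumptions, not the answerer's own code: indices start at 1 because MATLAB is 1-based, the names similarPairs and similarIdx are made up, and the threshold is set to 0.05 rather than 0.01, because 0.01 sits right on the spacing of this data, where floating-point rounding can tip the comparison either way.

```matlab
A = [2.30 2.32 2.67 2.44 2.31 1.23];
threshold = 0.05;            % assumed value, see the note above

n = numel(A);
similarPairs = zeros(0, 2);  % each row will hold one similar pair [i j]

for i = 1:n-1                % x_i
    for j = i+1:n            % x_j, every element after x_i
        dist = sqrt((A(i) - A(j))^2);   % equals abs(A(i) - A(j))
        if dist <= threshold
            similarPairs(end+1, :) = [i j]; %#ok<AGROW>
        end
    end
end

% all indices that occur in at least one similar pair
similarIdx = unique(similarPairs(:)).'
```

For the example vector the similar pairs are (1,2), (1,5) and (2,5), so similarIdx contains 1, 2 and 5.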

score 0 · Answer 3 · answered Jun 27 '16 at 21:57

This approach does not need an absolute tolerance; instead it needs a tolerance relative to the smallest difference in the data (diffFactor below). It always looks for the closest group in the data. In this form it will not work if you have exact duplicate values in your data, but you can easily extend it to handle that case as well.

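A=[2.30 2.32 2.67 2.44 2.31 1.23];
diffFactor=3;   % tolerance as a multiple of the smallest difference

% differences between neighbouring values after sorting
Asorted=sort(A);
Adiff=abs(Asorted(1:end-1)-Asorted(2:end));
[minDiff,minInd]=min(Adiff);

% lower value of the closest pair, used as the centre of the group
commonValue=Asorted(minInd);

% indices of all values within diffFactor*minDiff of commonValue
resultIndex=find(A>=commonValue-diffFactor*minDiff & A<=commonValue+diffFactor*minDiff)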
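
One possible way to extend this to data with exact duplicates is to take the differences between the distinct values only, so that the smallest difference can never be zero. This is a sketch of that idea under stated assumptions, not the answerer's own extension: the example vector with a repeated 2.30 is hypothetical, and so is the choice to treat a vector of identical values as a single group.

```matlab
A = [2.30 2.32 2.67 2.44 2.31 1.23 2.30];   % hypothetical data: 2.30 appears twice
diffFactor = 3;

Aunique = unique(A);              % sorted, duplicates removed
if numel(Aunique) < 2
    % all values identical (or A empty): treat everything as one group
    resultIndex = 1:numel(A);
else
    % smallest gap between distinct neighbouring values
    [minDiff, minInd] = min(diff(Aunique));
    commonValue = Aunique(minInd);
    % same window as before, written with abs()
    resultIndex = find(abs(A - commonValue) <= diffFactor*minDiff);
end
resultIndex
```

For this vector the result is the indices 1, 2, 5 and 7, so both copies of 2.30 are kept in the group. The choice of diffFactor remains a judgment call.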

If two values happen to be the same in the dataset, this code will ignore all the other close higher/lower values. And how should diffFactor be chosen? - erbal, Jul 01 '16 at 18:44
