To simplify, I consider the one-output, two-input version of uniquetol,
    C = uniquetol(A, tol);
where the first input is a double vector A. In particular, this implies that:
- The 'ByRows' option of uniquetol is not used.
- The first input is a vector. If it were not, uniquetol would implicitly linearize it to a column, as usual.
The second input, which defines the tolerance, is interpreted as follows:
Two values, u and v, are within tolerance if abs(u-v) <= tol*max(abs(A(:))).
That is, the specified tolerance is relative by default. The actual tolerance used in the comparisons is obtained by scaling it by the maximum absolute value in A.
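As a small illustration of this scaling, the following sketch turns a relative tolerance into the absolute threshold used in the comparisons. The helper name within is hypothetical and only used here; the input values are taken from the first example in the question.

```matlab
% Sketch: how the relative tolerance becomes an absolute threshold.
A = [1 3 5 7 9];
tol = 2.5/9;                                % relative tolerance
tol_scaled = tol*max(abs(A(:)));            % absolute threshold, about 2.5 here
within = @(u, v) abs(u - v) <= tol_scaled;  % hypothetical helper, not part of uniquetol
within(1, 3)   % true:  abs(1-3) = 2 does not exceed the threshold
within(1, 5)   % false: abs(1-5) = 4 exceeds the threshold
```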
With these considerations, it seems that the approach that uniquetol uses is:
1. Sort A.
2. Pick the first entry of sorted A, and set this as reference value (this value will have to be updated later).
3. Write the reference value into the output C.
4. Skip subsequent entries of sorted A until one is found that is not within tolerance of the reference value. When that entry is found, take it as the new reference value and go back to step 3.
Of course, I'm not saying that this is what uniquetol internally does. But the output seems to be the same, so this appears to be functionally equivalent to what uniquetol does.
The following code implements the approach described above (inefficient code, just to illustrate the point).
    % Inputs A, tol
    % Output C
    tol_scaled = tol*max(abs(A(:))); % scale tolerance
    C = []; % initialize output. Will be extended
    ref = NaN; % initialize reference value to NaN. This will immediately cause
               % the first sorted entry to become the new reference
    for a = sort(A(:)).'
        if ~(a-ref <= tol_scaled) % true for ref = NaN, or when a is out of tolerance
            ref = a;
            C(end+1) = ref;
        end
    end
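To make the skipping behaviour visible, here is a sketch that runs the same loop on a small input and prints what happens to each sorted entry. The input values and the printed messages are my own choices for illustration.

```matlab
% Trace of the greedy loop on a small, made-up input.
A = [0.52 0.1 1 0.15 0.5];
tol = 0.1;                         % relative tolerance
tol_scaled = tol*max(abs(A(:)));   % absolute threshold (0.1, since max(abs(A)) is 1)
C = [];
ref = NaN;                         % NaN forces the first sorted entry to be kept
for a = sort(A(:)).'
    if ~(a - ref <= tol_scaled)
        ref = a;
        C(end+1) = ref;
        fprintf('%.2f kept as new reference\n', a);
    else
        fprintf('%.2f skipped (within tolerance of %.2f)\n', a, ref);
    end
end
disp(C)
```

Here 0.15 is skipped because it is within 0.1 of the reference 0.10, and 0.52 is skipped because it is within 0.1 of the reference 0.50, so C ends up as [0.1 0.5 1].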
To verify this, let's generate some random data and compare the output of uniquetol and of the above code:
    clear
    N = 1e3; % number of realizations
    S = 1e5; % maximum input size
    for n = 1:N
        % Generate inputs:
        s = randi(S); % input size
        A = (2*rand(1,s)-1) / rand; % random input of length s; positive and
                                    % negative values; random scaling
        tol = .1*rand; % random tolerance (relative). Change value .1 as desired
        % Compute output:
        tol_scaled = tol*max(abs(A(:))); % scale tolerance
        C = []; % initialize output. Will be extended
        ref = NaN; % initialize reference value to NaN. This will immediately cause
                   % the first sorted entry to become the new reference
        for a = sort(A(:)).'
            if ~(a-ref <= tol_scaled)
                ref = a;
                C(end+1) = ref;
            end
        end
        % Check if output is equal to that of uniquetol:
        assert(isequal(C, uniquetol(A, tol)))
    end
In all my tests this has run without the assertion failing.
So, in summary, uniquetol seems to sort the input, pick its first entry, and keep skipping entries for as long as it can.
For the two examples in the question, the outputs are as follows. Note that the second input is specified as 2.5/9, where 9 is the maximum of the first input, to achieve an absolute tolerance of 2.5:
    >> uniquetol([1 3 5 7 9], 2.5/9)
    ans =
         1     5     9
    >> uniquetol([3 5 7 9], 2.5/9)
    ans =
         3     7
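As an alternative to dividing by the maximum by hand, uniquetol has a documented 'DataScale' name-value option; with a data scale of 1 the tolerance is used as an absolute one. The sketch below compares both ways of asking for an absolute tolerance of 2.5. The variable names are mine, and I would expect both calls to agree on these inputs.

```matlab
% Two ways of requesting an absolute tolerance of 2.5.
A1 = [1 3 5 7 9];
A2 = [3 5 7 9];
C1_rel = uniquetol(A1, 2.5/max(abs(A1)));     % relative tolerance, scaled by hand
C1_abs = uniquetol(A1, 2.5, 'DataScale', 1);  % absolute tolerance via 'DataScale'
C2_rel = uniquetol(A2, 2.5/max(abs(A2)));
C2_abs = uniquetol(A2, 2.5, 'DataScale', 1);
disp(isequal(C1_rel, C1_abs))  % expected to agree for this input
disp(isequal(C2_rel, C2_abs))  % expected to agree for this input
```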