1

I want to find the nearest column of a matrix with a vector.

Consider the matrix is D and the vector is y. I want an acceleration method for this function

function BOOLEAN = IsExsist(D,y)


[~, Ysize, ~] = size(D);
BOOLEAN = 0;
MIN = 1.5;

for i=1:Ysize 
    if(BOOLEAN == 1)
        break;
    end;
    if(norm(y - D(:,i),1) < MIN )
        BOOLEAN = 1;
    end;
end;

end
a d
  • 552
  • 2
  • 7
  • 17
  • Your code doesn't find the nearest column to a vector. I detects the first column that has distance lower than a given threshold. I'm answering to what the text says, not to what the code says – Luis Mendo May 25 '14 at 16:08

3 Answers3

2

I am assuming you are looking to "accelerate" this procedure. For the same, try this -

[~,nearest_column_number] = min(sum(abs(bsxfun(@minus,D,y))))

The above code uses 1-Norm (as used by you) along all the columns of D w.r.t. y. nearest_column_number is your desired output.

If you are interested in using a threshold MIN for the getting the first nearest column number, you may use the following code -

normvals = sum(abs(bsxfun(@minus,D,y)))
nearest_column_number = find(normvals<MIN,1)
BOOLEAN = ~isempty(nearest_column_number)

nearest_column_number and BOOLEAN are the outputs you might be interested in.

If you are looking to make a function out of it, just wrap in the above code into the function format you were using, as you already have the desired output from the code.

Edit 1: If you are using this procedure for a case with large D matrices with sizes like 9x88800, use this -

normvals = sum(abs(bsxfun(@minus,D,y)));
BOOLEAN = false;
for k = 1:numel(normvals)
    if normvals(k) < MIN
        BOOLEAN = true;
        break;
    end
end

Edit 2: It appears that you are calling this procedure/function a lot of times, which is the bottleneck here. So, my suggestion at this point would be to look into your calling function and see if you could reduce the number of calls, otherwise use your own code or try this slight modified version of it -

BOOLEAN = false;
for k = 1:numel(y)
    if norm(y - D(:,k),1) < MIN %// You can try replacing this with "if sum(abs(y - D(:,i),1)) < MIN" to see if it gives any performance improvement
        BOOLEAN = true;
        break;
    end
end
Divakar
  • 218,885
  • 19
  • 262
  • 358
  • this code is very professional but it's slow like my code, – a d May 25 '14 at 13:17
  • @MehdiKhademloo What are your `whos D y` info? – Divakar May 25 '14 at 13:18
  • D is 9*88800 and y is 9*1 – a d May 25 '14 at 13:25
  • @MehdiKhademloo What runtimes are you getting with my code, just for this nearest column finding part? – Divakar May 25 '14 at 13:28
  • you know? I want to create this D for my dictionary, I think this 88800 columns is not important and many of these columns are similar. I want to use this code in my dictionary creation function. I mean when i want to add a column in dictionary, use this function for check "is there a column in dictionary like my new data?"... – a d May 25 '14 at 13:32
  • @MehdiKhademloo Please check out Edit 1 and let me know if that is any faster and if it works for you. – Divakar May 25 '14 at 13:34
  • I check it, and i told you: This is very very professional code, but it is a bit faster against my code. It stop when Size(D) arrives 9*1000... – a d May 25 '14 at 13:39
  • @MehdiKhademloo What is 9*1000...? How many times are you calling this function? – Divakar May 25 '14 at 15:28
1

To find the nearest column of a matrix D to a column vector y, with respect to 1-norm distance, you can use pdist2:

[~, index] = min(pdist2(y.',D.','minkowski',1));
Luis Mendo
  • 110,752
  • 13
  • 76
  • 147
0

What you are currently trying to do is optimize your Matlab implementation of linear search. No matter how much you optimize that it will always need to calculate all D=88800 distances over all d=9 dimensions for each search. Now that's easy to implement in Matlab as discussed in the other answer, but if you are planning to do many such searches, I would recommend to use a different data-structure and search-algorithm instead.

A good canditate would be (binary) space partioning which recursively splits your space into two parts along your dimensions. This adds quite some intial overhead to create the tree and makes insert- and remove-operations a bit more expensive. But as I understand your comments, searches are much more frequent and their execution will reduce in complexits from O(D) downto O(log(D)) which is a tremendous improvement for this problem size.

I think that there should be some usable Matlab-implementations of BSP around, e.g. on Mathworks file-exchange. But if you don't manage to find one, I'd be happy to provide some pointers as well.

Community
  • 1
  • 1
mbschenkel
  • 1,865
  • 1
  • 18
  • 40