I'm using the kNN search function in MATLAB, but I'm calculating the distance between two objects of a class I defined myself, so I've written a custom distance function. This is it:

    function d = allRepDistance(obj1, obj2)
    % Calculates the minimum distance between representatives.
    % obj2 is a vector of objects, to fit the kNN function's requirements.
        n = size(obj2,1);
        d = zeros(n,1);
        for i = 1:n
            M = dist(obj1.Repr, [obj2(i,:).Repr]');
            d(i) = min(min(M));
        end
    end

The difference is that obj.Repr may be a matrix, and I want the minimal distance over all pairs of rows taken from the two arguments. But even if obj1.Repr is just a vector, which essentially gives the ordinary Euclidean distance between two vectors, the kNN function is slower by a factor of 200!

I've also checked the performance of the distance function on its own (no kNN). I measured the time it takes to calculate the distance between a vector and the rows of a matrix (when they are stored in the objects), and it is slower by a factor of 3 than the normal distance function.

Does that make any sense? Is there a solution?

Roy

1 Answer

You are using dist(), which is the Euclidean distance weight function. However, you are not weighting your data, i.e. you don't treat one dimension as more important than the others. Thus, you can directly use the Euclidean distance from pdist():

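    function d = allRepDistance(obj1, obj2)
    % Calculates the minimum distance between representatives.
    % obj2 is a vector of objects, to fit the kNN function's requirements.
        n = size(obj2,1);
        d = zeros(n,1);
        for i = 1:n
            % pdist treats each ROW of X as one observation, so the
            % representatives are stacked vertically (assumes one per row).
            X = [obj1.Repr; obj2(i,:).Repr];
            M = pdist(X,'euclidean');
            d(i) = min(M);   % pdist returns a row vector of pairwise distances
        end
    end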

BTW, I don't know your matrix dimensions, so you will need to deal with the concatenation of elements to create X correctly.
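One caveat when Repr has several rows: pdist returns the distances between every pair of rows in X, including pairs that belong to the same object. If only cross-object distances are wanted, pdist2 (from the same Statistics Toolbox) may be a closer fit. The following is only a sketch under the assumption that each row of Repr is one representative; the matrices A and B are made-up stand-ins for obj1.Repr and obj2(i).Repr.

```matlab
% Hypothetical stand-ins for obj1.Repr and obj2(i).Repr (one point per row).
A = [0 0; 1 1];          % two representatives in 2-D
B = [4 4; 1 2; 5 0];     % three representatives in 2-D

% pdist2 returns a size(A,1)-by-size(B,1) matrix of distances between
% each row of A and each row of B (Euclidean by default).
M = pdist2(A, B);

% Smallest distance over all cross-object row pairs.
d = min(M(:));
```

Here the closest pair is (1,1) and (1,2), so d comes out as 1.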

tashuhka
  • Thanks. I've tried it now with n=2000 and it is still slower by a factor of 60. Is it something inherent to using MATLAB classes, or is there something problematic with my code? – Roy Oct 14 '14 at 09:34
  • You can try profiling your code (`profile`) or simply add `tic`/`toc` calls to see where it spends the most time. Once you find the bottleneck, you can post a separate, more specific question on SO. – tashuhka Oct 16 '14 at 21:30
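
For reference, a minimal sketch of the two timing approaches mentioned in the last comment. The workload here (X, q, and the plain Euclidean distance line) is a made-up stand-in; in practice you would replace it with the kNN call that uses allRepDistance.

```matlab
% Hypothetical stand-in workload; replace with your own kNN call.
X = rand(2000, 3);       % 2000 points in 3-D
q = rand(1, 3);          % one query point

% Approach 1: the profiler collects per-function and per-line timings.
profile on
d = sqrt(sum(bsxfun(@minus, X, q).^2, 2));   % code being profiled
profile off
p = profile('info');     % struct holding the collected timing data
% profile viewer         % uncomment to open the interactive report

% Approach 2: wall-clock timing of a single region with tic/toc.
tic
d = sqrt(sum(bsxfun(@minus, X, q).^2, 2));
elapsed = toc;
fprintf('Elapsed: %.4f s\n', elapsed);
```

The profiler report is usually the more informative of the two when the slow part is hidden inside a loop or a property access, while tic/toc is handy for a quick before-and-after comparison.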