2

I have a set of points or coordinates like {(3,3), (3,4), (4,5), ...} and want to build a matrix with the minimum distance to this point set. Let me illustrate using a runnable example:

width = 10;
height = 10;

% Get min distance to those points
pts = [3 3; 3 4; 3 5; 2 4];

sumSPts = length(pts);

% Helper to determine element coordinates
[cols, rows] = meshgrid(1:width, 1:height);
PtCoords = cat(3, rows, cols);

AllDistances = zeros(height, width,sumSPts);

% To get Roh_I of evry pt
for k = 1:sumSPts

    % Get coordinates of current Scribble Point
    currPt = pts(k,:);

    % Get Row and Col diffs
    RowDiff = PtCoords(:,:,1) - currPt(1);
    ColDiff = PtCoords(:,:,2) - currPt(2);

    AllDistances(:,:,k) = sqrt(RowDiff.^2 + ColDiff.^2);

end

MinDistances = min(AllDistances, [], 3);

This code runs perfectly fine but I have to deal with matrix sizes of about 700 milion entries (height = 700, width = 500, sumSPts = 2k) and this slows down the calculation. Is there a better algorithm to speed things up?

pfuhlert
  • 533
  • 4
  • 16
  • check the diff function - it might help you – 16per9 May 10 '16 at 12:44
  • 1
    If you have the image processing toolbox, look into [`bwdist`](http://www.mathworks.com/help/images/ref/bwdist.html). With 700 million points, any solution is going to be slow. – Suever May 10 '16 at 13:26
  • Actually, you don't have to deal with matrices with 700M entries. You can slice up your pts vector, keep the min of the slices and then take the global min. Would a processing time of ~20s be good for you? – BillBokeey May 10 '16 at 14:01

1 Answers1

2

As stated in the comments, you don't necessary have to put everything into a huge matrix and deal with gigantic matrices. You can :

1. Slice the pts matrix into reasonably small slices (say of length 100)

2. Loop on the slices and calculate the Mindistances slice over these points

3. Take the global min


tic

Mindistances=[];

width = 500;
height = 700;

Np=2000;

pts = [randi(width,Np,1) randi(height,Np,1)];

SliceSize=100;

[Xcoords,Ycoords]=meshgrid(1:width,1:height);


% Compute the minima for the slices from 1 to floor(Np/SliceSize)
for i=1:floor(Np/SliceSize)

    % Calculate indexes of the next slice
    SliceIndexes=((i-1)*SliceSize+1):i*SliceSize


    % Get the corresponding points and reshape them to a vector along the 3rd dim.
    Xpts=reshape(pts(SliceIndexes,1),1,1,[]);
    Ypts=reshape(pts(SliceIndexes,2),1,1,[]);

    % Do all the diffs between your coordinates and your points using bsxfun singleton expansion
    Xdiffs=bsxfun(@minus,Xcoords,Xpts);
    Ydiffs=bsxfun(@minus,Ycoords,Ypts);

    % Calculate all the distances of the slice in one call
    Alldistances=bsxfun(@hypot,Xdiffs,Ydiffs);

    % Concatenate the mindistances
    Mindistances=cat(3,Mindistances,min(Alldistances,[],3));


end

% Check if last slice needed
if mod(Np,SliceSize)~=0

    % Get the corresponding points and reshape them to a vector along the 3rd dim.
    Xpts=reshape(pts(floor(Np/SliceSize)*SliceSize+1:end,1),1,1,[]);
    Ypts=reshape(pts(floor(Np/SliceSize)*SliceSize+1:end,2),1,1,[]);

    % Do all the diffs between your coordinates and your points using bsxfun singleton expansion
    Xdiffs=bsxfun(@minus,Xcoords,Xpts);
    Ydiffs=bsxfun(@minus,Ycoords,Ypts);

    % Calculate all the distances of the slice in one call
    Alldistances=bsxfun(@hypot,Xdiffs,Ydiffs);

    % Concatenate the mindistances
    Mindistances=cat(3,Mindistances,min(Alldistances,[],3));

end

% Get global minimum
Mindistances=min(Mindistances,[],3);

toc

Elapsed time is 9.830051 seconds.

Note :

You'll not end up doing less calculations. But It will be a lot less intensive for your memory (700M doubles takes 45Go in memory), thus speeding up the process (With the help of vectorizing aswell)


About bsxfun singleton expansion

One of the great strength of bsxfun is that you don't have to feed it matrices whose values are along the same dimensions.

For example :

Say I have two vectors X and Y defined as :

X=[1 2]; % row vector X
Y=[1;2]; % Column vector Y

And that I want a 2x2 matrix Z built as Z(i,j)=X(i)+Y(j) for 1<=i<=2 and 1<=j<=2.

Suppose you don't know about the existence of meshgrid (The example is a bit too simple), then you'll have to do :

Xs=repmat(X,2,1);
Ys=repmat(Y,1,2);
Z=Xs+Ys;

While with bsxfun you can just do :

Z=bsxfun(@plus,X,Y);

To calculate the value of Z(2,2) for example, bsxfun will automatically fetch the second value of X and Y and compute. This has the advantage of saving a lot of memory space (No need to define Xs and Ys in this example) and being faster with big matrices.

Bsxfun Vs Repmat

If you're interested with comparing the computational time between bsxfun and repmat, here are two excellent (word is not even strong enough) SO posts by Divakar :

Comparing BSXFUN and REPMAT

BSXFUN on memory efficiency with relational operations

Community
  • 1
  • 1
BillBokeey
  • 3,168
  • 14
  • 28
  • Thanks for your post! The solution does a pretty good job but fails when Np / SliceSize is not an integer (points get left out of the calculation). The "last" SliceIndexes with length(SliceIndexes) < Np/SliceSize is not calculated correctly. How good is the performance of "bsxfun(@minus,Xcoords,Xpts);" vs simply subtracting? – pfuhlert May 11 '16 at 12:55
  • You're welcome! Of course for the last slice, a simple fix will seal the deal. About the substraction, `bsxfun` doesn't just do that, it also gets rid of a call to repmat by using singleton expansion. I'm editing my answer with the fix. I'm editing my answer with the changes and a little explanation of singleton expansion – BillBokeey May 11 '16 at 13:00