2

I'm having trouble to speed up the following function and reading through various "How to vectorize in Matlab/Octave" unfortunatelly has not helped me on this specific topic.

Here is what I want to achieve: I have a set of 2d sample points (given as x-y-coordinate pairs) and a set of line segments (also x-y-coordinate pairs). The points are approximately near the lines and I want to get the distance from each of the sample points to the nearest line segment, BUT only if I can project the sample perpendicular onto the line segment.

So while I have managed to put the algorithm into a nested for-loop and it gives me the correct result for the appended example, it is incredibly slow (as expected) for real datasets (about 4000 sample points and 6000 line segments) and besides, looks really ugly.

Can anyone help me to create a more sophisticated version of this piece of code?

Edit: The mathematics used in this algorithm can be looked up here: http://mathworld.wolfram.com/Point-LineDistance2-Dimensional.html

clc;
clear all;
close all;

ptsx = [0.5 3 5];
ptsy= [1 -1.5 0.5];
points = [ptsx; ptsy];

linesx = [0 2 4 6];
linesy = [0 0 0 0];
lines = [linesx; linesy];

% for each point in the sample dataset
for k=1:1:length(points(1,:))
  clear 'distvec'
  % calculate the distance to each line segment in the model dataset
  for l=1:1:length(lines(1,:))-1

    % vector of the line segment
    a = [lines(1,l+1)-lines(1,l), lines(2,l+1)-lines(2,l)];
    % vector from the start of the line segment to the sample point
    b = [points(1,k) - lines(1,l), points(2,k) - lines(2,l)];

    % check if the sample point can be projected onto the line segment
    if norm(a) ~= 0
      lba = dot(a,b)/norm(a)^2;
    else
      lba = -1;
    end

    if (lba >= 0) && (lba <= 1)
      % calculate distance from sample point to single line segment
      x1 = [lines(1,l) lines(2,l)];
      x2 = [lines(1,l+1) lines(2,l+1)];
      x0 = [points(1,k) points(2,k)];
      dist = abs(det([x2-x1; x1-x0]))/norm(x2-x1);
      distvec(end+1) = dist;
    end
  end
  dist(end+1) = min(distvec);

end

figure;
hold on;
plot(ptsx, ptsy, 'bo');
plot(linesx, linesy, 'r-o');
xlim([-1 7]);
ylim([-2 2]);
  • the short answer is arrayfun/cellfun. Could you please check distvec and dist variables in your code? I got 'Undefined function or variable 'distvec'' error when running the example – Gryphon Sep 04 '17 at 15:42
  • 3
    @Gryphon Using `arrayfun`/`cellfun` is not vectorization – Sardar Usama Sep 04 '17 at 15:55
  • @ Gryphon: I would like to resolve the issue with "Undefined function or variable 'distvec'" error but I am not able to recreate this error. I am using Octave version 4.2.1 on Windows 7. I just copy/pasted the code from here into an empty file and it runs on my machin without complaining about `distvec`. – Olbert Dämmerer Sep 04 '17 at 16:12
  • 1
    You should clearly mention in your question that you're using Octave instead of stating MATLAB/Octave. However in MATLAB, to produce the similar (not exactly same) behavior, you'd use `distvec=[]` instead of `clear 'distvec'` – Sardar Usama Sep 04 '17 at 16:57
  • @SardarUsama that's not true is it? `clear` does the same thing in octave as in matlab. (plus `distvec=[]` doesn't get rid of the variable from the workspace) – Tasos Papastylianou Sep 04 '17 at 19:00
  • 1
    @SardarUsama oh I see what you're referring to. Interesting, I didn't realise you could initialise a variable in octave using the `end` keyword without it needing to exist in the workspace already. huh. – Tasos Papastylianou Sep 04 '17 at 19:02
  • I would suggest you post your problem in the Code review community of StackExchange. – Nicky Mattsson Sep 05 '17 at 06:20

2 Answers2

2

@Edit Variant with no loops but and most truly vectorize I can manage. At the moment I cannot test it (it requires r2016b and newer), but according to documentation it should work

ptsx = [3 0.5 5];
ptsy= [-1.5 1 0.5];
%order row is [x,y]
points = [ptsx; ptsy]';

linesx = [0 2 4 6];
linesy = [0 0 0 0];
lines = kron([linesx;linesy],[1 1]);
lines = lines(:,2:size(lines,2)-1);
%each row is [x1 y1 x2 y2]
lines=reshape(lines,4,[])';

lenx = lines(:,3) - lines(:,1);
leny = lines(:,4) - lines(:,2);
%remove degenerated lines
sq_norm = [lenx.*lenx + leny.*leny]';
rem_idx = sq_norm < eps;
sq_norm(rem_idx) = 1;

%Starting from r2016b no need of dirty hacks with bsxfun. Finally!
diff_startx = points(:,1) - lines(:,1)';
diff_starty = points(:,2) - lines(:,2)';

pos = (diff_startx .* lenx + diff_starty .* leny) ./ sq_norm;
pos(pos < 0) = 0;
pos(pos > 1) = 1;
dist = hypot(pos .* lenx - diff_startx, pos .* leny - diff_starty);

My variant with no loops

pts = [ 3   0.5 5;... %reordered for test purposes
       -1.5 1   0.5];
cpts = num2cell(pts,1);

lines = [0 2 4 6;...
         0 0 0 0];
%convert and split intercepts into two x-y pairs
lines = kron(lines,[1 1]);
lines = lines(:,2:size(lines,2)-1); %suggesting three intercepts
clines = mat2cell(lines, 2, repmat(2,1,size(lines,2)/2));

% [lines(1,2) - lines(1,1), lines(2,2) - lines(2,1)]
diff_lines = diff(lines,1,2); 
diff_lines = diff_lines(:,1:2:size(diff_lines,2));
diff_lines = num2cell(diff_lines,1);
% norm for each line
norms = cellfun(@(x) norm(x), diff_lines,'un',0);
idx = cellfun(@(x) x~=0, norms);

% [lines(1,2) - lines(1,1), lines(2,2) - lines(2,1)]
diff_start = cellfun(@(x,y) cellfun(@(z) {x(:,1)-z(:,1)},y),...
             cpts,num2cell(repmat(clines, numel(cpts),1),2)','un',0);
%choose corresponding to nonzero norm
lba = cellfun(@(x,y,n) cellfun(@(b) dot(x,b)/norm(n)^2, y, 'un', 0) ...
              ,diff_lines(idx),diff_start(idx),norms(idx),'un',0);
%find first matching lba for each point
idx_line = cellfun(@(x) find(cellfun(@(y) (y >= 0) & (y <= 1), x),1,'first'), lba);
%reorder intermediate results
clines = clines(idx_line);
norms = norms(idx_line);
diff_lines = diff_lines(idx_line);
% [lines(1,2) - pounts(1,1), lines(2,2) - pounts(2,1)]
diff_start = cellfun(@(x,y) y(:,1)-x(:,1),...
             cpts,clines,'un',0);
dists = cellfun(@(x,y,n) abs(det([x'; y']))/n, diff_lines, diff_start, norms)

Edits for TS code to work i MATLAB

lines = [linesx; linesy];
% Edit1: distance storage preallocation
distmat = nan(length(ptsx),length(linesx)-1);
for k=1:1:length(points(1,:))
...
dist = abs(det([x2-x1; x1-x0]))/norm(x2-x1);
% Edit 2: filling the storage
      distmat(k,l) = dist;
    end
...
% Edit 3: getting the distance
dists_vals = cellfun(@(x) min(x(~isnan(x))), num2cell(distmat,2))
Gryphon
  • 549
  • 3
  • 14
  • 3
    As noted in the comments cellfun/arrayfun is not vectorization. Additionally as repeatedly observed is using arrayfun actually slower than a standard for-loop. https://stackoverflow.com/questions/12522888/arrayfun-can-be-significantly-slower-than-an-explicit-loop-in-matlab-why – Nicky Mattsson Sep 05 '17 at 06:17
  • Thank you for your link. I knew than *fun are slow, but didn't notice they are _so_ slow. I thought in terms of C: using threads or MP brings benefit if single loop takes thousands processor ticks (microseconds) – Gryphon Sep 05 '17 at 10:06
2

Solution for GNU Octave, should also work in MATLAB. Please report if not.

ptsx = [0.5 3 5];
ptsy= [1 -1.5 0.5];
linesx = [0 2 4 6];
linesy = [0 0 0 0];

p = [ptsx;ptsy];
% start of lines
l = [linesx(1:end-1);linesy(1:end-1)];
% vector perpendicular on line
v = [diff(linesy);-diff(linesx)];
% make unit vector
v = v ./ hypot (v(1,:),v(2,:));
v = repmat (v, 1, 1, size (p, 2));
% vector from points (in third dimension) to start of lines (second dimension)
r = bsxfun (@minus, permute (p, [1 3 2]), l);
d = abs (dot (v, r));
dist = squeeze (min (d, [], 2))

gives

dist =

   1.00000
   1.50000
   0.50000
Andy
  • 7,931
  • 4
  • 25
  • 45
  • Thank you very much! This does exactly what I wanted and is incredibly faster than my version with the loop. How would I have to adjust it, if I want to calculate the distance to the start or the end of the line segment if the projection does not lay "within" the line segment. For example: `p = [4;2]` and `linesx = [0 2]` and `linesy = [0 0]` returns `dist = 2` but it should be the euclidean distance between the point and the end of the line segment? – Olbert Dämmerer Sep 06 '17 at 10:25
  • @OlbertDämmerer I don't understand the "within" part. What if you have to lines like "V" and a point between the upper two ends, to which line should this point be "mapped"? Perhaps you can create a handwritten sketch? – Andy Sep 06 '17 at 14:39
  • I have opened a new question because the original question has been answered. See [here](https://stackoverflow.com/questions/46079554/how-to-vectorize-this-operation-in-octave-calculate-distance-point-to-line-se) for addition explanation. – Olbert Dämmerer Sep 06 '17 at 15:53