Condense a set of points of a polygon into a shorter set of points

Question

I have the following polygon which is just a set of 2D points as follows:-

poly0=[80    60
    90    60
   100    60
   110    60
   110    50
   120    50
   130    50
   140    50
   150    50
   160    50
   170    50
   180    50
   190    50
   200    50
   210    50
   210    60
   210    70
   210    80
   210    90
   220    90
   220   100
   210   100
   210   110
   200   110
   200   120
   190   120
   180   120
   180   130
   170   130
   160   130
   150   130
   140   130
   130   130
   130   120
   120   120
   110   120
   110   110
   100   110
   100   100
    90   100
    90    90
    90    80
    90    70
    80    70
    80    60];

Now I can plot it using.

>> line(poly0(:,1), poly0(:,2),'Color','k','LineWidth',3,'LineStyle',':');

This clearly shows one thing that my original set of polygon points is highly redundant. Basically, multiple points lying on the same straight line are enumerated above which is not needed. I could start checking each pair of points and if they are on the same straight line I could remove them. But that would mean using many for loops. I cannot come up with a smart vectorized way.

How do I get a new set of points which is much shorter in size than the previous but which still represents the exact same polygon? I should only have as many points as there are vertices in the polygon. So in other words how to find the vertices from the above data set in a quick way?

PS: Here the vertex angles are 90 degrees, but if you give a solution don't try to exploit that fact. I want a more general answer.

It's just one loop: for each 3 consecutive points, if they are on a straight line, remove the middle one. If you remove a point, check again with the same first point, adding a new 3rd point. After the loop, you also need to check with points `end-1`, `end` and `1`, and with `end`, `1`, and `2`. — Cris Luengo, Apr 09 '19 at 22:45
Potential duplicate of https://stackoverflow.com/questions/53839733/remove-redundant-points-on-plot/53840350#53840350 — Durkee, Apr 09 '19 at 22:54
@Durkee Ha, Works! But I think it exploits the fact that the polygonal angles are 90 deg. please see PS above. — user_1_1_1, Apr 09 '19 at 23:00
@CrisLuengo I guess your answer is the most general answer that works. I will accept it if it is made into an answer with some codes. — user_1_1_1, Apr 09 '19 at 23:03

Cris Luengo · Accepted Answer · 2019-04-10T04:05:53.380

The two existing answers have large shortcomings:

Durkee's method only works if the distances between subsequent points is exactly identical. Points must have coordinates perfectly representable as floating-point values, so that the distances between subsequent points on a line can be found to be identical. If points are not equidistant, the method does nothing. Also, the start and end of the polygon are not examined together, so if a straight line is formed across the start/end of the polygon, one point too many will remain.
ShadowMan's method is better in that distances do not need to be identical, and the line across the start/end of the polygon is correctly handled. However it also uses floating point equality comparisons, which will not work in general. Only with integer coordinates will this method work correctly. Furthermore it uses vecnorm (which does a square root) and a division, both are relatively expensive operations (when compared to the method shown here).

To see if three points form a straight line, a simple arithmetic rule can be used. Let's say we have points p0, p1 and p2. The vector from p0 to p1 and the vector from p0 to p2 form the basis of a parallelogram, whose area can be computed by the cross product of the two vectors (in 2D, the cross product is understood to use z=0, with the resulting vector having x=0 and y=0, only the z value is useful; thus, the 2D cross product we assume to produce a scalar value). It can be computed as follows:

v1 = p1 - p0;
v2 = p2 - p0;
x = v1(1)*v2(2) - v1(2)*v2(1);

x, the cross product, will be zero if the two vectors are parallel, meaning that the three points are collinear. But the test for equality to 0 must have some tolerance, since floating-point arithmetic is inexact. I use 1e-6 as a tolerance here. Use a value that is several orders of magnitude smaller than the distance between your points.

Given an input set of points p, we can find the corner points with:

p1 = p;                                  % point 1
p0 = circshift(p1,1);                    % point 0
v1 = p1 - p0;                            % vector from point 0 to 1
v2 = circshift(p1,-1) - p0;              % vector from point 0 to 2
x = v1(:,1).*v2(:,2) - v1(:,2).*v2(:,1); % cross product
idx = abs(x) > 1e-6;                     % comparison with tolerance
p = p(idx,:);                            % corner points

Do note that this cross product test will fail if two consecutive points have identical coordinates (i.e. one of the vectors has a zero length). An additional test would be necessary if the data could have duplicated points.

Here are the results of the three methods. I've created a polygon with non-trivial coordinates and not equally spaced vertices. I've also put the start/end gap in the middle of a straight edge. These characteristics are purposeful to show the shortcomings of the other two methods.

This is the code I used to produce the graph:

% Make a polygon that will be difficult for the other two methods
p = [0,0 ; 0.5,0 ; 1,0 ; 1,1 ; 0.5,1 ; 0,1];
p = p + rand(size(p))/3;
p(end+1,:) = p(1,:);
q = [];
for ii = 1:size(p,1)-1
   t = p(ii,:) + (p(ii+1,:) - p(ii,:)) .* [0;0.1;1/3;0.45;0.5897545;pi/4;exp(1)/3];
   q = [q;t];
end
q = circshift(q,3,1);

figure
subplot(2,2,1)
plot(q(:,1),q(:,2),'bo-')
axis equal
title('input')

subplot(2,2,2)
res1 = method1(q);
plot(res1(:,1),res1(:,2),'ro-')
axis equal
title('Durkee''s method')

subplot(2,2,3)
res2 = method2(q);
plot(res2(:,1),res2(:,2),'ro-')
axis equal
title('ShadowMan''s method')

subplot(2,2,4)
res3 = method3(q);
plot(res3(:,1),res3(:,2),'go-')
axis equal
title('correct method')

% Durkee's method: https://stackoverflow.com/a/55603145/7328782
function P = method1(P)
a = logical([1 diff(P(:,1),2)' 1]);
b = logical([1 diff(P(:,2),2)' 1]);
idx = or(a,b);
P = P(idx,:);
end

% ShadowMan's method: https://stackoverflow.com/a/55603040/7328782
function corners = method2(poly0)
poly0Z = circshift(poly0,1);
poly0I = circshift(poly0,-1);
unitVectIn =(poly0 - poly0I)./vecnorm((poly0 - poly0I),2,2);
unitVectOut =(poly0Z - poly0)./vecnorm((poly0Z - poly0),2,2);
cornerIndices = sum(unitVectIn == unitVectOut,2)==0;
corners = poly0(cornerIndices,:);
end
% vecnorm is new to R2017b, I'm still running R2017a.
function p = vecnorm(p,n,d)
% n is always 2
p = sqrt(sum(p.^2,d));
end

function p = method3(p1)
p0 = circshift(p1,1);
v1 = p1 - p0;
v2 = circshift(p1,-1) - p0;
x = v1(:,1).*v2(:,2) - v1(:,2).*v2(:,1);
idx = abs(x) > 1e-6;
p = p1(idx,:);
end

Chris, thanks for the feedback. I attempted to implement tolerance into my script using your jittery points. `(abs(unitVectIn-unitVectOut)<1e-6` it doesn't quite do the trick because it leaves me with a vector. Would `sum((abs(unitVectIn-unitVectOut)<1e-6) == 2 ` be appropriate because now instead of comparing floats I am just comparing integers that are a result from the first boolean? I would like to learn more about the principles behind your critique if you know any good resources. — ShadowMan, Apr 10 '19 at 15:27
@ShadowMan: Yes, that is what I meant. It is about replacing `a==b` with `abs(a-b) — Cris Luengo, Apr 10 '19 at 15:32
is there any reason you opted to calculate the cross product manually instead of using `cross()`? — ShadowMan, Apr 11 '19 at 19:06
@ShadowMan: `cross` requires 3D vectors. I'd have to add a column of zeros (for the 3rd dimension), and it then would compute the x and y values which I already know will be zero. So that would be less efficient. I don't know if MATLAB has a function to compute the 2D cross product. Also, I modified the code from C++ code that I already had, I did take the effort to replace the loop using `circshift` that I saw in your answer, but didn't look to see if anything else could be simplified. So leaving this manual cross product in was the easy solution. :) — Cris Luengo, Apr 11 '19 at 19:22
Thanks for the tips here, correcting the float comparison in my script leads to good results even on the 'jittery' data. I will always be on the lookout for this problem with my code from here forward :D — ShadowMan, Apr 11 '19 at 19:45
@ShadowMan: If you do that, you're already ahead of the curve! :) -- BTW: [this is our standard dupe-target for floating point comparison errors](https://stackoverflow.com/q/686439/7328782). Look at the linked questions, you'll see how common this error is. Those are just the ones in the MATLAB tag, I'm sure there are similar dupe targets for other languages. — Cris Luengo, Apr 11 '19 at 19:57

ShadowMan · Answer 2 · 2019-04-10T01:21:02.670

The 'vector' way can be done quite elegantly. I tried a for loop way too and you can do the same thing that way but you asked for vector so here is my way of doing that.

The only change I made to your data was to remove any replicates prior to starting this script. Also, the points provided should be in sequential order going either clockwise or counter clockwise.

    poly0Z = circshift(poly0,1);
    poly0I = circshift(poly0,-1); 
    unitVectIn =(poly0 - poly0I)./vecnorm((poly0 - poly0I),2,2);
    unitVectOut =(poly0Z - poly0)./vecnorm((poly0Z - poly0),2,2)  ;
    cornerIndices = sum(unitVectIn == unitVectOut,2)==0
    corners = poly0(cornerIndices,:)

    line(poly0(:,1), poly0(:,2),'Color','k','LineWidth',2,'LineStyle',':');
    hold on
    scatter(corners(:,1), corners(:,2),'filled')

The basis of this method is to go to each point, compute the unit vector coming in, and the unit vector going out. Points where the unit vector in does not match the unit vector out are corners.

This method works well, except for the equality comparison. You should allow for a tolerance when comparing normalized vectors (`abs(unitVectIn-unitVectOut)<1e-6` or similar). — Cris Luengo, Apr 10 '19 at 03:58

score 1 · Answer 3 · answered Apr 10 '19 at 00:39

1

Okay, I adapted this to deal with non-square corners.

Consider a triangle identified by the points

P = [0 0; 1 0; 2 0; 1.5 1; 1 2; .5 1; 0 0];

This is a 7x2 array, if we then define the 2 derivative vectors as defined by the question I mentioned in the comments.

a = logical([1 diff(P(:,1),2)' 1]);
b = logical([1 diff(P(:,2),2)' 1]);

From there, we may combine the two to get a new indexing variable

idx = or(a,b);

Finally, we may use this to make our plot

line(P(idx,1), P(idx,2),'Color','k','LineWidth',3,'LineStyle',':');

If you're doing a line plot, I think you need to set the last variable to false.

idx(end) = false;

answered Apr 10 '19 at 00:39

Durkee

778
6
14

This way is much faster to the solution than mine computationally, as well as time to write the code :D. – ShadowMan Apr 10 '19 at 01:37
If the first and last points form a straight line you don’t remove enough points. The other answer does do that with the circshift, I think. – Cris Luengo Apr 10 '19 at 01:58
True, depending on the situation it might not matter. Alternatively, it can be fixed with an if statement. – Durkee Apr 10 '19 at 02:20
I just realized the method has another shortcoming. You assume vertices are equally spaced. If points on a straight line are not at equal distances, they remain in the output. – Cris Luengo Apr 10 '19 at 03:57

Condense a set of points of a polygon into a shorter set of points

3 Answers3