Area of polygon calculation inconsistent

Question

I was using voronoi_finite_polygons_2d(vor, radius=None) function that I found elsewhere on StackOverflow.

I want to modify it to show the centroid of each voronoi cell. Debugging why some centroids appear dramatically wrong (see green arror pointing out centroid way off in the weeds). First error I identified: some of the calculations weren't processing the vertices in proper all-clockwise or all-counterclockwise order.

Not sure why some points don't get sorted correctly, but before I investigate that, I found another anomaly.

I should get the same area (with opposite sign) if I go clockwise or counterclockwise. In simple examples, I do. But in a random polygon I made, I get /slightly/ different results.

import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import Voronoi
import random
import math

def measure_polygon(vertices):
    xs = vertices[:,0]
    ys = vertices[:,1]
    xs = np.append(xs,xs[0])
    ys = np.append(ys,ys[0])

    #https://en.wikipedia.org/wiki/Centroid#Centroid_of_polygon
    area = sum(xs[i]*(ys[i+1]-ys[i-1]) for i in range(0, len(xs)-1))/2.0
    centroid_x =  sum((xs[i]+xs[i+1])*(xs[i]*ys[i+1] - xs[i+1]*ys[i]) for i in range(0, len(xs)-1))/(6.0*area)
    centroid_y =  sum((ys[i]+ys[i+1])*(xs[i]*ys[i+1] - xs[i+1]*ys[i]) for i in range(0, len(xs)-1))/(6.0*area)

    return (area, (centroid_x, centroid_y))

The first example work as expect -- same area and centroid regardless of processing order (cw or ccw).

d = [[0.0 ,  0.0], [1.0,3.0],[  5.0,3.0],[   4.0 ,  0.0] ] 
print len(d)

defects = [] 
defects.append([d[0], d[1], d[2], d[3]]) 
defects.append([d[3], d[2], d[1], d[0]])

for v in defects:
    print measure_polygon(np.array(v))

simple parallelogram output:

4 
(-12.0, (2.5, 1.5)) 
(12.0, (2.5, 1.5))

But now look at this 4-sided polygon (that is almost a triangle)

#original list of vertices
d = [[-148.35290745 ,  -1.95467472], [-124.93580616 ,  -2.09420039],[  -0.58281373,    1.32530292],[   8.77020932 ,  22.79390931] ]
print len(d)

defects = []
#cw
defects.append([d[0], d[2], d[3], d[1]])
#ccw
defects.append([d[1], d[3], d[2], d[0]])

for v in defects:
    print measure_polygon(np.array(v))

Gives me weird output:

4
(1280.4882517358433, (-36.609159411740798, 7.5961622623413145))
(-1278.8546083623708, (-36.655924939495335, 7.6058658049196115))

The areas are different. And if areas are different, then the centroids will be different. The discrepancies of area (1280 versus 1278) are so large that I doubt it's a floating point rounding thing. But other than that, I've run out of hypotheses why this isn't working.

===============================

I found the error.... my list-comprehension/indexing hack to enable y-1 and y+1 notation was broken (in a sinister way that half-worked). The correct routine is as follows:

def measure_polygon(vertices):
    xs = vertices[:,0]
    ys = vertices[:,1]

    #the first and last elements are for +1 -1 to work at end of range
    xs = vertices[-1:,0]
    xs = np.append(xs,vertices[:,0])
    xs = np.append(xs,vertices[:1,0])

    ys = vertices[-1:,1]
    ys = np.append(ys,vertices[:,1])
    ys = np.append(ys,vertices[:1,1])

    #for i in range(1, len(xs)-1):
    #    print ("digesting x, y+1, y-1 points: {0}/{1}/{2}".format(xs[i], ys[i+1], ys[i-1]))

    #https://en.wikipedia.org/wiki/Centroid#Centroid_of_polygon
    area = sum(xs[i]*(ys[i+1]-ys[i-1]) for i in range(1, len(xs)-1))/2.0
    centroid_x =  sum((xs[i]+xs[i+1])*(xs[i]*ys[i+1] - xs[i+1]*ys[i]) for i in range(1, len(xs)-1))/(6.0*area)
    centroid_y =  sum((ys[i]+ys[i+1])*(xs[i]*ys[i+1] - xs[i+1]*ys[i]) for i in range(1, len(xs)-1))/(6.0*area)

    return (area, (centroid_x, centroid_y))

So now NaN's example works right:

#NaN Example
d = [[3.0 ,  4], [5.0,11],[  12.0,8],[   9.0 ,  5],[5,6] ]
print "number of vertices: {0}".format(len(d))

defects = []
defects.append([d[0], d[1], d[2], d[3], d[4] ])
defects.append([ d[4], d[3], d[2], d[1], d[0]])

for v in defects:
    print measure_polygon(np.array(v))

results:

number of vertices: 5
(-30.0, (7.166666666666667, 7.6111111111111107))
(30.0, (7.166666666666667, 7.6111111111111107))

NaN · Answer 1 · 2016-11-06T05:47:46.570

Polygons have to be self closing so the first and last point are equal. This is pretty standard. You can use the Shoelace formula ( https://en.m.wikipedia.org/wiki/Shoelace_formula ) with the regular coordinates, but if I get a dataset missing the replicated last point, I just add it.. which makes the calculations easier. So consider a polygon without holes defined by the following coordinates (from the reference). Note the first and last point are the same...if they aren't you will get alignment errors for multipart polygons (polygons with holes for instance)

x = np.array([3,5,12,9,5,3]) # wikipedia
y= np.array([4,11,8,5,6,4])
a = np.array(list(zip(x,y)))
area1 = 0.5*np.abs(np.dot(x, np.roll(y, 1))-np.dot(y, np.roll(x, 1)))
area2 =0.5*np.abs(np.dot(x[1:], y[:-1])-np.dot(y[1:], x[:-1]))
print("\nroll area {}\nslice area{}".format(area1, area2))

yielding

roll area 30.0
slice area30.0

Now your polygon was treated the same, by adding the first point back in as the last point to give closure to the polygon

x = np.array([-148.35290745, -124.93580616, -0.58281373,  8.77029032, -148.35290745])
y = np.array([-1.95467472, -2.09420039,  1.32530292,  22.79390931, -1.95467472])
roll area  1619.5826480482792
slice area 1619.5826480482792

The area result differs from yours but I confirmed it using a third method using einsum. A portion of the script is below

def ein_area(a, b=None):
    """Area calculation, using einsum.
    :Requires:
    :--------
    :  a - either a 2D+ array of coordinates or an array of x values
    :  b - if a < 2D, then the y values need to be supplied
    :  Outer rings are ordered clockwise, inner holes are counter-clockwise
    :Notes:
    :  x => array([ 0.000,  0.000,  10.000,  10.000,  0.000])  .... OR ....
    :  t = x.reshape((1,) + x.shape)
    :      array([[ 0.000,  0.000,  10.000,  10.000,  0.000]]) .... OR .... 
    :  u = np.atleast_2d(x)
    :      array([[ 0.000,  0.000,  10.000,  10.000,  0.000]]) .... OR ....
    :  v = x[None, :]
    :      array([[ 0.000,  0.000,  10.000,  10.000,  0.000]])
    """
    a = np.array(a)  
    if b is None:
        xs = a[..., 0]
        ys = a[..., 1]
    else:
        xs, ys = a, b
    x0 = np.atleast_2d(xs[..., 1:])
    y0 = np.atleast_2d(ys[..., :-1])
    x1 = np.atleast_2d(xs[..., :-1])
    y1 = np.atleast_2d(ys[..., 1:])
    e0 = np.einsum('...ij,...ij->...i', x0, y0)
    e1 = np.einsum('...ij,...ij->...i', x1, y1)
    area = abs(np.sum((e0 - e1)*0.5))
    return area

But you can see that is largely based on the slicing/rolling approach. I would check to see if you can confirm results by including the last point which is normally missing from polygon lists but assumed.

Thanks, found an error in my indexing/list-comprehension. Btw, when you processed my example and got area of +/-1619, that that order of processing isn't going in pure CCW or CW order. When I do it in proper CW/CCW order I get +/-1270.1387323316385. When I describe polygon as you did, I also get +/-1619.5827808873739 — user3556757, Nov 06 '16 at 05:12
When working with geometry objects in my field, polygons always have the first and last point equal, and that includes the rings. This prevents 'bleed-through' between the inner and outer rings. It is nothing we have to do, it just is, so when I have to do manual demo's, it is second nature. This also prevents confusion. Four points are used to represent a polygon or a close-loop polyline. Three points can only represent a polyline. I have expanded ein_area for you to explore with nested rings. outer rings are CW, inner rings (ie form the hole) are CCW — NaN, Nov 06 '16 at 05:41
BTW... where did all the other examples go... there used to be 4 now there are 2! Is this normal? — NaN, Nov 06 '16 at 05:51

score 0 · Answer 2 · answered Nov 06 '16 at 05:12

0

The reason for that is it's missing the last point. It can be pretty much the same as the first one, a polygon has to be a circuit.

Databases usually omit it because it's standard to have it as the same as the first one.

answered Nov 06 '16 at 05:12

I leave the polygons defined as just a list of their vertices, and the area/centroid calculation was supposed to properly account for that need to have a closed circuit. Except, I screwed up my indexing. Once I fixed that, it works. – user3556757 Nov 06 '16 at 05:18

Area of polygon calculation inconsistent

2 Answers2