3

Does anyone know of a way to merge thousands of polygons that are contiguous? I've been using turf's union function to do this in my prototypes but the time it takes grows to be way too slow as the list of polygons increases. I'm hoping/aiming for a solution that takes sub second time.

Here is how I've been doing it.

const turfUnion = require('@turf/union').default;

const polygons = [ ... ];

const result = polygons.merge((m, f) => turfUnion(m, f));

As I mentioned this is too slow. It takes close to 5 minutes to merge 10,000 features.

I'm guessing there is a way to do this much faster given that I know which polygons share a point with which other polygons and that all the polygons are contagious. The final result can have holes so the solution has to focus on interior perimeters as well as well as the external one.

Any ideas or open source solutions would be great. Solutions in Javascript are preferred, but other low level languages would be OK.

And here is a picture of one of large sets of polygons I'm looking to merge. The data for this can be found here.

input

And here is the expected output.

output

2 Answers2

4

Pairwise combine the polygons recursively The cost of computing the union of two polygons scales with the number of points in each. So you can reduce the runtime by reducing the number of operations that involve large polygons.

The approach I take is to combine polygon1 with polygon2, then polygon3 with polygon4, all the way up to polygon(N-1) with polygonN.

Then I repeat the process combining polygon1_2 (the union of polygons 1 and 2 from the previous step) with polygon3_4, all the way up to combining polygon(N-3)_(N-2) with polygon(N-1)_(N).

Keep repeating this process until you only have one polygon remaining.

Here's some sample code. It's python, not Javascript, but porting it shouldn't be difficult.

def union(geom_list):
    """Rapidly combine a list of geometries into one.
    
    Inputs
    ---------
    geom_list
        A list of gdal/ogr geometries
        
    Returns
    -----------
    A single geometry representing the union of the inputs
    """
    
    if len(geom_list) == 1:
        return geom_list[0]
    
    if len(geom_list) == 2:
        return geom_list[0].Union(geom_list[1])
        
    size = int(np.floor(len(geom_list)/2))
    geom1 = union(geom_list[:size])
    geom2 = union(geom_list[size:])
    return geom1.Union(geom2)

I can't claim this is the fastest possible way to do it, but it's much faster than adding one polygon at a time.

Fergal
  • 494
  • 3
  • 12
1

At the risk of sending you down a rabbit hole, here's what I would try if I was in your shoes...

Stage 1: O(n). Consolidate all the line segments into an array, such that you end up with an array of line segments (ie, [x0,y0,x1,y1]) representing every polygon...

[
  [30.798218847530226, -26.663920391848013, 30.798209281734188, -26.66394228167575],
  [30.798209281734188, -26.66394228167575, 30.798201318284743, -26.663914720621534],
  [30.798201318284743, -26.663914720621534, 30.798218847530226, -26.663920391848013],
...
]

Stage 2: O(n log n). Sort this entire array by x0, such that the line segments are now ordered according to the x value of the beginning of the segment.

Stage 3: O(1). Beginning with the first element in the sorted array (segment 0), we can make the assumption that the segment with the leftmost x0 value has to be on the edge of the outer polygon. At this point we have segment 0's [x0,y0,x1,y1] as the starting outer edge segment.

Stage 4: O(log n). Now, find the corrresponding line segments that begin with the end of the previous segment. In other words, which segments connect to the current segment? This should be less than a handful, typically one or two. Searching for the matching x0 is assumed to be binary, followed by a short localized linear search for all matching [x0,y0] combinations.

Stage 5: O(1). If only one segment's [x0,y0] matched the last segment's [x1,y1], then add the segment to the list of outer edges. If more than one matching segment was found, then (assuming that we're moving in a clockwise direction) find the [x0,y0] pair that is furthest left of the current line segment, or if the outer edge is taking a right turn and none of the matching segments is to the left, then the [x0,y0] pair that is closest to the right of the current line segment. Add this new segment to the list of outer edges.

(Note that there are matrix algorithms, which avoid the more expensive trig functions, to determine whether a point is to the left or right of a segment, in addition to the perpendicular distance from a point to a line / segment.)

Stage 6: O(~n). Repeat Stage 4 until back at the starting outer edge segment.

Overall algorithm should be O(n log n)...

Note that this does not take into account interior perimeters, but believe that if you can determine a beginning segment that forms part of an interior perimeter and know whether the starting segment is moving clockwise or counterclockwise, then the same algorithm should work...

Trentium
  • 3,419
  • 2
  • 12
  • 19