52

I have a large number of polygons (~100000) and try to find a smart way of calculating their intersecting area with a regular grid cells.

Currently, I am creating the polygons and the grid cells using shapely (based on their corner coordinates). Then, using a simple for-loop I go through each polygon and compare it to nearby grid cells.

Just a small example to illustrate the polygons/grid cells.

from shapely.geometry import box, Polygon
# Example polygon 
xy = [[130.21001, 27.200001], [129.52, 27.34], [129.45, 27.1], [130.13, 26.950001]]
polygon_shape = Polygon(xy)
# Example grid cell
gridcell_shape = box(129.5, -27.0, 129.75, 27.25)
# The intersection
polygon_shape.intersection(gridcell_shape).area

(BTW: the grid cells have the dimensions 0.25x0.25 and the polygons 1x1 at max)

Actually this is quite fast for an individual polygon/grid cell combo with around 0.003 seconds. However, running this code on a huge amount of polygons (each one could intersect dozens of grid cells) takes around 15+ minutes (up to 30+ min depending on the number of intersecting grid cells) on my machine which is not acceptable. Unfortunately, I have no idea how it is possible to write a code for polygon intersection to get the area of overlap. Do you have any tips? Is there an alternative to shapely?

HyperCube
  • 3,870
  • 9
  • 41
  • 53
  • 1
    I'm curious how you are looping and intersecting your polygons. Can you show more code on the process? It would be easier to figure out how this can be optimized. – tdedecko Feb 05 '13 at 00:30
  • I basically take an array of lat/lon corner values and convert them in a for loop to the polygons. Then, I compare each polygon to certain grid cell, which is done in a for-loop again. See this: http://stackoverflow.com/a/13956110/1740928 – HyperCube Feb 05 '13 at 08:35

3 Answers3

78

Consider using Rtree to help identify which grid cells that a polygon may intersect. This way, you can remove the for loop used with the array of lat/lons, which is probably the slow part.

Structure your code something like this:

from shapely.ops import cascaded_union
from rtree import index
idx = index.Index()

# Populate R-tree index with bounds of grid cells
for pos, cell in enumerate(grid_cells):
    # assuming cell is a shapely object
    idx.insert(pos, cell.bounds)

# Loop through each Shapely polygon
for poly in polygons:
    # Merge cells that have overlapping bounding boxes
    merged_cells = cascaded_union([grid_cells[pos] for pos in idx.intersection(poly.bounds)])
    # Now do actual intersection
    print(poly.intersection(merged_cells).area)
Mike T
  • 41,085
  • 18
  • 152
  • 203
  • 7
    This remains an incredibly helpful answer - it should have been accepted. I had a similar problem and `Rtree` made the algorithm run around 5000 times faster. – Gabriel Jan 09 '16 at 11:13
  • 3
    Note that `Rtree` can be only used for boxes (4 points), not complex polygons. – Ikar Pohorský Feb 15 '17 at 10:58
  • For "real" polygons just add a proper `actual geometry intersects?` check for each bounds intersection. The rtree let's you reduce the search space and things are super fast. – bugmenot123 May 16 '18 at 11:09
  • While an old answer, this is the result that i found to remind me what object.. rtree.. If you have a complex poly, just envelope it and then check against that first before the complex one.. Will help speed things up again.+1 for simple example – Angry 84 Aug 30 '18 at 06:33
  • 1
    BTW, shapely has an R-tree implementation, as noted in the answer below: https://shapely.readthedocs.io/en/stable/manual.html#str-packed-r-tree – chris Jun 11 '19 at 19:51
32

Since 2013/2014 Shapely has STRtree. I have used it and it seems to work well.

Here is a snippet from the docstring:

STRtree is an R-tree that is created using the Sort-Tile-Recursive algorithm. STRtree takes a sequence of geometry objects as initialization parameter. After initialization the query method can be used to make a spatial query over those objects.

>>> from shapely.geometry import Polygon
>>> from shapely.strtree import STRtree
>>> polys = [Polygon(((0, 0), (1, 0), (1, 1))), Polygon(((0, 1), (0, 0), (1, 0))), Polygon(((100, 100), (101, 100), (101, 101)))]
>>> s = STRtree(polys)
>>> query_geom = Polygon([(-1, -1), (2, 0), (2, 2), (-1, 2)])
>>> result = s.query(query_geom)
>>> polys[0] in result
True
Georgy
  • 12,464
  • 7
  • 65
  • 73
Phil
  • 5,822
  • 2
  • 31
  • 60
  • 2
    This is so helpfull. Do you know if the STRtree can be serialized with pickle or marshall libraries to save it for later use? – eguaio Aug 30 '17 at 22:25
  • 1
    No, I am not familiar with the serialization capabilities of the STRtree. I believe it is completely dependent on the serialization of the _tree_handle returned by ``shapely.geos.GEOSSTRtree_create(max(2, len(geoms)))`` – Phil Sep 15 '17 at 23:38
0

Here is another version of the answer but with more control over IoU

```
    def merge_intersecting_polygons(list_of_polygons, image_width, image_height):
    """
    merge intersecting polygons with shapely library
    merge only if Intersection over Union is greater than 0.5
    speed up with STRTree
    """
    # create shapely polygons
    shapely_polygons = []
    for polygon in list_of_polygons:
        shapely_polygons.append(Polygon(polygon))
    # create STRTree
    tree = STRtree(shapely_polygons)
    # merge polygons
    merged_polygons = []
    for i, polygon in enumerate(shapely_polygons):
        # find intersecting polygons
        intersecting_polygons = tree.query(polygon)
        # merge intersecting polygons
        for intersecting_polygon in intersecting_polygons:
            if polygon != intersecting_polygon:
                # compute intersection over union
                intersection = polygon.intersection(intersecting_polygon).area
                union = polygon.union(intersecting_polygon).area
                iou = intersection/union
                if iou > 0.5:
                    # merge polygons
                    polygon = polygon.union(intersecting_polygon)
        # add merged polygon to list
        merged_polygons.append(polygon)
    # remove duplicates
    merged_polygons = list(set(merged_polygons))
    # convert shapely polygons to list of polygons
    list_of_polygons = []
    for polygon in merged_polygons:
        list_of_polygons.append(np.array(polygon.exterior.coords).tolist())
    return list_of_polygons

```
Achyut Sarma
  • 119
  • 7