1

I am trying to join two spatial datasets. The first contains points and the second polygons.

However, some of the points are outside of the polygons.

Is there a simple way of joining/snapping these points to the nearest polygon boundary , not the nearest polygon centroid?

At the moment I am joining to the nearest polygon centroid but this does not yield the results I am looking for.

  • It is possible to decompose polygon into a set of line and find the nearest line to the point and give the line polygon as nearest polygon – BIg G Nov 18 '20 at 16:18
  • The correct solution can be found in [this other thread](https://stackoverflow.com/a/56398341/13078832). – npetrov937 Nov 05 '22 at 17:31

2 Answers2

1

You need to put all points (not polygon points into a KD-Tree) using something like the sklearn package. This package contains an efficient nearest neighbours calculation. In Python it can be imported using:

 import sklearn.neighbors as neighbors

If you have about 10 million polygons you only need a tree depth of 12 for it to be efficient. You can experiment with this. If less that 100,000 a leaf_size=9 might be enough. The code to put all points (in one single array) into a tree is done using the following:

 tree = neighbors.KDTree( arrayOfPoints, leaf_size=12 )

Then you iterate over each polygon and the individual points in each polygon to find the nearest 5 points (for instance). The algorithm is superquick at finding these because of the nature of the KDTree. Bruteforce comparison can be 1000 times slower (as I found for massive data sets).

 shortestDistances, closestIndices = tree.query( pointInPolygon, k=5 )

You might just want the nearest point, so you can set k=1 and then the closestIndices[0] is what you want for the actual array index from the point list.

Eamonn Kenny
  • 1,926
  • 18
  • 20
0

This is no complete answer, but you can check distances between points and polygons'boudaries using shapely :

from shapely.geometry import Point, Polygon
p = Point(0,0)
poly = Polygon([[1,0], [1,10], [10, 0]])
print(p.distance(poly))

Edit : So if you're not using big datasets, you could do something like :

my_points = [...]
my_polys = [...]

dict_point_poly = {}
for point in my_points:
  try:
    intersecting_poly = [poly for poly in my_polys if 
point.intersects(poly)][0]
    this_poly = intersecting_poly
  except IndexError:
    distances = [(poly, point.distance(poly)) for poly in my_polys]
    distances.sort(key=lambda x:x[1])
    this_poly = distances[0][0]
  finally:
    dict_point_poly[point] = this_poly

(not the most efficient method, but one easily understood I think)

tgrandje
  • 2,332
  • 11
  • 33