
I have a MultiPolygon that represents a road, and I would like to find which GPS points fall within a distance x of the road. The geo_buf below is road.buffer(x). Calling geo_buf.contains(Point) repeatedly is very slow, as the profiling below shows (most of the time is spent on line 297).

How can I speed this up?

from line_profiler import LineProfiler
from shapely.geometry import Point as shapely_Point

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   151                                           def filter_gps(gps_row, geo_buf):
   152    606446   62042960.0    102.3     83.3      pot = shapely_Point(gps_row['longitude'], gps_row['latitude'])
   153    606446   12433530.0     20.5     16.7      return geo_buf.contains(pot)



Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================

  294      1232      11850.0      9.6       0.0   if len(df_gps.index) > 1:
  295                                               geo_buf = shape(json.loads(srg_row['srg_buf']))
  296                                               # filter the GPS points
  297      1232   98465688.0  79923.4     68.4      df_filter = df_gps[df_gps.apply(lambda row: filter_gps(row, geo_buf), axis=1)]
iamanigeeit
TinyAnt

1 Answer


These may be helpful:

Is there way to optimize speed of shapely.geometry.shape.contains(a_point) call?

https://gis.stackexchange.com/questions/102933/more-efficient-spatial-join-in-python-without-qgis-arcgis-postgis-etc/165413#165413

(Not tested) I believe the fastest way to do this is to split your Polygon into many smaller Polygons then use geopandas.tools.sjoin.
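As a rough sketch of a related idea (not the sjoin approach itself), the snippet below combines a vectorized bounding-box pre-filter with a prepared geometry from `shapely.prepared.prep`, so that Point objects are only built for points that could plausibly be inside. The helper name `filter_points` and the buffered line used as a stand-in road are assumptions for illustration; they are not from the question.

```python
import numpy as np
from shapely.geometry import LineString, Point
from shapely.prepared import prep


def filter_points(lons, lats, geo_buf):
    """Return a boolean mask marking the points that lie inside geo_buf."""
    lons = np.asarray(lons, dtype=float)
    lats = np.asarray(lats, dtype=float)

    # Cheap vectorized pre-filter: anything outside the bounding box
    # cannot be inside the geometry, so no Point is created for it.
    minx, miny, maxx, maxy = geo_buf.bounds
    mask = (lons >= minx) & (lons <= maxx) & (lats >= miny) & (lats <= maxy)

    # Prepare the geometry once; repeated contains() calls reuse its index.
    prepared = prep(geo_buf)
    for i in np.flatnonzero(mask):
        mask[i] = prepared.contains(Point(lons[i], lats[i]))
    return mask


# Hypothetical stand-in for the buffered road (units are arbitrary here).
road_buf = LineString([(0, 0), (10, 0)]).buffer(1.0)
mask = filter_points([5.0, 5.0, 10.9], [0.5, 2.0, 0.9], road_buf)
```

With a DataFrame like the one in the question, the mask could presumably be applied as `df_gps[filter_points(df_gps['longitude'].values, df_gps['latitude'].values, geo_buf)]`, which avoids the row-wise `apply`. How much this helps depends on how many points fall inside the bounding box of the buffer.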

iamanigeeit
  • geo_buf is of the same type as patch: `from shapely.geometry import Point; patch = Point(0.0, 0.0).buffer(10.0)` – TinyAnt Nov 28 '18 at 01:34
  • OK, now I understand your problem. I have submitted edits to the question for peer review. – iamanigeeit Nov 29 '18 at 02:47
  • If `geo_buf` is a circle then you should just calculate the haversine distance and filter instead of creating Point objects. See https://stackoverflow.com/questions/25767596/vectorised-haversine-formula-with-a-pandas-dataframe – iamanigeeit Nov 29 '18 at 02:49
  • I'm sorry I didn't explain my question very well. geo_buf is actually more like `Polygon([(0,0),(1,1),(0,1),(-1,0)]).buffer(10)`; it's a MultiPolygon, so the distance calculation is not possible. I'm trying to use geopandas, which means I need to convert each longitude and latitude into a `<shapely.geometry.point.Point at 0x7f3997c54550>` with `Point(longitude, latitude)`. I have about 10 million GPS points, and the process takes too much time. – TinyAnt Dec 03 '18 at 08:37
  • The real background is a road. A bus travels along the road and leaves a GPS trace. Suppose I widen the road by 30 meters on both sides; that widened road is geo_buf. I then want to find the GPS points that lie on the widened road. – TinyAnt Dec 03 '18 at 08:54