I have two dataframes: one holds 932060 coordinate points and the other holds 13205 rectangular polygons.
PolygonName PolygonCoordinates
0 JZ221 16.509907001328 42.942029002482 16.51009100175...
1 JZ222 16.510091001752 42.960758994106 16.51027300282...
2 JZ248 16.527602997503 42.904377009196 16.52778999695...
3 JZ249 16.527789996959 42.92310700082 16.527975996737...
4 JZ250 16.527975996737 42.941837994914 16.52815999716...
... ... ...
13200 NB484625 31.663816002416 38.701211008476 31.66485300095...
13201 NB484781 31.677563999715 38.616109998867 31.67861600195...
13202 NB484782 31.678616001952 38.637080008588 31.67966399693...
13203 NB484783 31.679663996936 38.658051002215 31.68070900143...
13204 NB484784 31.680709001432 38.679022998312 31.68175099835...
[13205 rows x 2 columns]
point_no Latitude Longitude
0 1 24.673719 46.708474
1 2 24.673720 46.708474
2 3 24.673722 46.708474
3 4 24.673723 46.708474
4 5 24.673724 46.708474
... ... ... ...
932055 932056 24.818875 46.618623
932056 932057 24.818889 46.618653
932057 932058 24.818904 46.618690
932058 932059 24.818919 46.618728
932059 932060 24.818932 46.618768
[932060 rows x 3 columns]
I want to iterate over both dataframes and add a new PolygonName column to the points dataframe
that indicates which polygon in the polygons dataframe contains each point:
from shapely.geometry import Polygon, Point
import pandas as pd

polygons = pd.read_excel("polygons.xlsx")
points = pd.read_csv("points.csv")

for polygon_index, polygon_row in polygons.iterrows():
    polyString = polygon_row["PolygonCoordinates"]
    polyList = polyString.split(" ")
    polygonPoint1 = (float(polyList[0]), float(polyList[1]))
    polygonPoint2 = (float(polyList[2]), float(polyList[3]))
    polygonPoint3 = (float(polyList[4]), float(polyList[5]))
    polygonPoint4 = (float(polyList[6]), float(polyList[7]))
    # create shapely Polygon object from coordinates
    polygon = Polygon([polygonPoint1, polygonPoint2, polygonPoint3, polygonPoint4, polygonPoint1])
    for point_index, point_row in points.iterrows():
        # create shapely Point object from Latitude and Longitude
        Point_X = float(point_row["Longitude"])
        Point_Y = float(point_row["Latitude"])
        # the polygon strings appear to hold latitude/longitude pairs,
        # so the point is built in the same (Latitude, Longitude) order
        point = Point(Point_Y, Point_X)
        # check if polygon contains the point
        if polygon.contains(point):
            points.loc[point_index, "PolygonName"] = polygon_row["PolygonName"]

print(points)
The output should look like this:
point_no Latitude Longitude PolygonName
0 1 24.673719 46.708474 RH275435
1 2 24.673720 46.708474 RH275435
2 3 24.673722 46.708474 RH275435
3 4 24.673723 46.708474 RH275435
4 5 24.673724 46.708474 RH275435
... ... ... ... ...
932055 932056 24.818875 46.618623 JZ249
932056 932057 24.818889 46.618653 JZ249
932057 932058 24.818904 46.618690 JZ249
932058 932059 24.818919 46.618728 JZ241
932059 932060 24.818932 46.618768 JZ242
This works fine for a small number of points, but as the point count grows it takes far too long, because every point is tested against every polygon (932060 × 13205, roughly 1.2 × 10^10 contains checks). How can I solve this efficiently?
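One direction that might avoid the nested loops is a spatial index over the polygons, so each point is only tested against polygons whose bounding box it falls in. Below is a minimal sketch, assuming Shapely 2.x (for the vectorized `shapely.points`, `shapely.polygons` and `STRtree.query` with a `predicate`), and assuming every PolygonCoordinates string holds at least four whitespace-separated coordinate pairs in the same order as the Latitude/Longitude columns. The helper name `assign_polygon_names` is hypothetical, not something from an existing library.

```python
import numpy as np
import pandas as pd
import shapely
from shapely import STRtree


def assign_polygon_names(points: pd.DataFrame, polygons: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `points` with a PolygonName column (hypothetical helper)."""
    # Parse each coordinate string into four (lat, lon) pairs,
    # giving an array of shape (n_polygons, 4, 2).
    coords = np.array(
        [s.split()[:8] for s in polygons["PolygonCoordinates"]], dtype=float
    ).reshape(-1, 4, 2)
    # Unclosed rings are closed automatically when the polygons are built.
    polygon_geoms = shapely.polygons(coords)

    # Build all points at once, in the same (Latitude, Longitude) order
    # that the polygon coordinates are assumed to use.
    point_geoms = shapely.points(
        points["Latitude"].to_numpy(), points["Longitude"].to_numpy()
    )

    # Index the polygons once, then query every point in a single call.
    # "within" keeps pairs where the point lies within the polygon,
    # which corresponds to polygon.contains(point).
    tree = STRtree(polygon_geoms)
    point_idx, polygon_idx = tree.query(point_geoms, predicate="within")

    # Points that fall in no polygon keep None. If polygons overlap,
    # which match ends up stored for a point is not controlled here.
    names = np.full(len(points), None, dtype=object)
    names[point_idx] = polygons["PolygonName"].to_numpy()[polygon_idx]

    result = points.copy()
    result["PolygonName"] = names
    return result
```

With the two dataframes loaded as in the question, this would be called as `points = assign_polygon_names(points, polygons)`. Points exactly on a polygon edge are not counted as inside, which matches the behaviour of `polygon.contains(point)` in the loop version.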