df1:
id | latlong_tuple |
---|---|
364 | (17.3946820179646, 78.042644262359) |
365 | (17.3945761480423, 78.0427466415321) |
1085 | (17.3950200947952, 78.0432334533569) |
1086 | (17.3947638830589, 78.0430426797909) |
1087 | (17.3945460707558, 78.0430666916614) |
df2:
index | latlong_tuple |
---|---|
01 | (17.431952, 78.37396) |
02 | (17.48295, 78.306694) |
03 | (17.479734, 78.34914) |
04 | (17.368366, 78.38604) |
05 | (17.433102, 78.37506) |
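
Current code:

    import haversine as hs
    from haversine import Unit
    from tqdm import tqdm

    def tileId_mapping(sample_cord, tile_cord, tile):
        # sample_cord: df2 points, tile_cord: df1 points, tile: df1 ids
        result = []
        for i in tqdm(range(0, len(sample_cord))):
            dis_list = []
            for j in range(0, len(tile_cord)):
                dis = hs.haversine(sample_cord[i], tile_cord[j], unit=Unit.METERS)
                dis_list.append(dis)
            shortest_dis = min(dis_list)
            min_index = dis_list.index(shortest_dis)
            result.append(tile[min_index])  # id of the nearest tile point
        return result
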
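This code is too slow when df1 has 320,096 rows and df2 has 5,299,669 rows: the nested loops compute the haversine distance for every pair of rows, which is about 1.7 × 10^12 calls. Can someone please help me make it faster?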
Thanks in advance.
For each df2.latlong_tuple, I want the df1.id of the nearest df1 point, i.e. the df1 row whose latlong_tuple has the shortest distance to that df2.latlong_tuple.
I want a result like the one below:
df2:
| index | latlong_tuple | Id |
|------ |-----------------------------------|----|
| 01 |(17.431952, 78.37396) |356 |
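
To make the intended mapping reproducible without the full data, here is a minimal sketch that uses only the Python standard library and the sample rows above. It is still the same brute-force search, so it is not a fix for the speed problem. The names `haversine_m` and `nearest_ids` and the Earth radius constant are assumptions made for this illustration, not part of the `haversine` package. Because it only sees the five sample rows of df1, the ids it returns come from that sample rather than from the full table.

```python
import math

# Mean Earth radius in metres (assumed value for this sketch).
EARTH_RADIUS_M = 6_371_008.8

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearest_ids(sample_cord, tile_cord, tile_ids):
    """For each sample point, return the id of the closest tile point (brute force)."""
    result = []
    for p in sample_cord:
        # Index of the tile point with the smallest distance to p.
        best = min(range(len(tile_cord)), key=lambda j: haversine_m(p, tile_cord[j]))
        result.append(tile_ids[best])
    return result

# Sample rows copied from df1 and df2 above.
df1_ids = [364, 365, 1085, 1086, 1087]
df1_points = [
    (17.3946820179646, 78.042644262359),
    (17.3945761480423, 78.0427466415321),
    (17.3950200947952, 78.0432334533569),
    (17.3947638830589, 78.0430426797909),
    (17.3945460707558, 78.0430666916614),
]
df2_points = [
    (17.431952, 78.37396),
    (17.48295, 78.306694),
    (17.479734, 78.34914),
    (17.368366, 78.38604),
    (17.433102, 78.37506),
]

# One df1 id per df2 row, in df2 order.
mapped = nearest_ids(df2_points, df1_points, df1_ids)
print(list(zip(df2_points, mapped)))
```

The argument order follows the function in the question: the df2 points are the samples, and the df1 points and ids are the tiles.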