Imagine we have 2 dataframes with coordinates ['X','Y']:
df1 :
X     Y     House №
2531  2016  175
2219  2196  11
2901  3426  201
6901  4431  46
7891  1126  89
df2 :
X       Y     Delivery office №
2534    2019  O1
6911    4421  O2
2901    3426  O3
7894.5  1120  O4
My idea is to merge them and get:
df3 :
X     Y     House №  Delivery office №
2531  2016  175      O1
2219  2196  11       NA
2901  3426  201      O3
6901  4431  46       O2
7891  1126  89       O4
So we want to perform a 'fuzzy' merge on the coordinates, matching rows that lie within a distance threshold (this parameter should be supplied by the user). You can see that house number 11 didn't get any delivery office number because it is located too far away from all of the offices in df2.
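So I need every row of df1 to 'find' its closest row in df2 and take that row's 'Delivery office №' value, provided the distance is within the threshold. The usual built-in pd.merge does not work here, because it only matches on exact key values. Custom packages that implement fuzzy matching do not help either, because they target string values, using Levenshtein distance and similar metrics.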
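To make the requested behaviour concrete, here is a minimal sketch of one possible approach, not a definitive implementation. It assumes Euclidean distance on the ['X','Y'] columns, a brute-force comparison of every house against every office, and a non-empty df2. The function name fuzzy_coordinate_merge, its parameters, and the threshold of 15 are hypothetical choices made for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    "X": [2531, 2219, 2901, 6901, 7891],
    "Y": [2016, 2196, 3426, 4431, 1126],
    "House №": [175, 11, 201, 46, 89],
})
df2 = pd.DataFrame({
    "X": [2534, 6911, 2901, 7894.5],
    "Y": [2019, 4421, 3426, 1120],
    "Delivery office №": ["O1", "O2", "O3", "O4"],
})


def fuzzy_coordinate_merge(left, right, value_col, threshold, on=("X", "Y")):
    """Attach right[value_col] of the nearest right row to each left row.

    Hypothetical helper: rows of `left` whose nearest neighbour in `right`
    is farther away than `threshold` (Euclidean distance) get NaN instead.
    """
    cols = list(on)
    a = left[cols].to_numpy(dtype=float)   # shape (n_left, 2)
    b = right[cols].to_numpy(dtype=float)  # shape (n_right, 2)

    # Pairwise Euclidean distances via broadcasting: shape (n_left, n_right).
    dist = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2))

    # Index of the closest right row for each left row, and its distance.
    nearest = dist.argmin(axis=1)
    nearest_dist = dist[np.arange(len(a)), nearest]

    # Keep the matched value only when it lies within the threshold.
    matched = right[value_col].to_numpy(dtype=object)[nearest]
    out = left.copy()
    out[value_col] = np.where(nearest_dist <= threshold, matched, np.nan)
    return out


# Assumed threshold: 15 coordinate units.
df3 = fuzzy_coordinate_merge(df1, df2, "Delivery office №", threshold=15)
print(df3)
```

With this threshold, house 11 stays unmatched because its nearest office is roughly 361 units away, while the other houses are within 15 units of an office. The distance matrix holds one entry per house/office pair, so for large inputs a spatial index (for example scipy.spatial.cKDTree, whose query method accepts a distance_upper_bound argument) would probably scale better.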