I have two data frames in which each observation is a geographic location defined by a latitude/longitude pair. For each point in df1, I would like to find the closest point in df2 and its associated value. I know how to do this by computing all possible distances (e.g. with the gdist function from the Imap package) and taking the index of the smallest one. However, this is extremely slow at best, since df1 has 1,000 rows and df2 about 15 million.
Do you have an idea of how I could reach my goal without computing all the distances? Maybe there is a way to limit the necessary calculations (for instance using the difference in latitude/longitude values)?
Thanks for helping,
Val
Here's what df1 looks like:
Latitude Longitude
1 56.76342 8.320824
2 54.93165 9.115982
3 55.80685 9.102455
4 57.27000 9.760000
5 56.76342 8.320824
6 56.89333 9.684435
7 56.62804 8.571573
8 56.64850 8.501947
9 55.40596 8.884374
10 54.89786 11.880828
And here is df2:
Latitude Longitude Value
1 41.91000 -4.780000 40500
2 41.61063 14.750832 13500
3 41.91000 -4.780000 4500
4 38.70000 -2.350000 28500
5 52.55172 0.088622 1500
6 39.06000 -1.830000 51000
7 41.91000 -4.780000 49500
8 48.00623 -4.389639 12000
9 56.24889 -3.666940 27000
10 42.72000 -3.750000 49500
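For reference, the brute-force approach I describe above can be sketched roughly as follows. This is only a minimal sketch under a few assumptions: haversine_km is a hand-written helper standing in for gdist (so the snippet needs no packages, and its distances may differ slightly from gdist's), and the two data frames hold just a few of the sample rows shown above.

```r
# Great-circle (haversine) distance in km.
# Hypothetical helper used here instead of Imap::gdist so the sketch is self-contained.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# A few of the sample rows from the question.
df1 <- data.frame(
  Latitude  = c(56.76342, 54.93165, 55.80685),
  Longitude = c(8.320824, 9.115982, 9.102455)
)
df2 <- data.frame(
  Latitude  = c(41.91000, 41.61063, 52.55172, 56.24889),
  Longitude = c(-4.780000, 14.750832, 0.088622, -3.666940),
  Value     = c(40500, 13500, 1500, 27000)
)

# For each row of df1, compute the distance to every row of df2
# and keep the index of the smallest one.
nearest_idx <- vapply(seq_len(nrow(df1)), function(i) {
  d <- haversine_km(df1$Latitude[i], df1$Longitude[i],
                    df2$Latitude, df2$Longitude)
  which.min(d)
}, integer(1))

# Attach the matching row index and value from df2 to df1.
result <- cbind(df1,
                nearest_row   = nearest_idx,
                nearest_value = df2$Value[nearest_idx])
```

Each pass over df2 is vectorised, but the total work is still nrow(df1) * nrow(df2) distance evaluations, which is what I would like to avoid.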