In R, I have a data frame with x, y columns holding lat/long coordinates. How do I find which rows are separated by the minimum distance, and assign a number in a new column to mark them? A simple example below shows the two rows, (5,3) and (5,2), that have the minimum distance; column C gives them the same group number.
1 Answer
I guess you may need distm() from the geosphere package:
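library(geosphere)
# distm() expects longitude first, then latitude
xy <- setNames(data.frame(rbind(c(0,0),c(90,90),c(10,10),c(-120,-45))),c("lon","lat"))
# pairwise distance matrix (in meters)
d <- distm(xy)
# row/column indices of the smallest non-zero distance
inds <- which(min(d[d>0])==d,arr.ind = TRUE)
# flag both rows of the closest pair with group number 1
out <- cbind(xy,C = NA)
out$C[inds[,"row"]] <- 1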
which gives
> out
   lon lat  C
1    0   0  1
2   90  90 NA
3   10  10  1
4 -120 -45 NA
Dummy data
> dput(xy)
structure(list(lon = c(0, 90, 10, -120), lat = c(0, 90, 10, -45
)), class = "data.frame", row.names = c(NA, -4L))

ThomasIsCoding
- Once I identify the two associated rows, how would I go about making a column that categorizes those two points? Can I map those results back to the original dataframe? – Kate Oct 20 '20 at 04:08
- Thank you. From here, how would I iterate that for the rest of the points? I want to find the next minimum distance and group those points. The part I'm especially trying to understand is that it should also compare the minimum distance between a point and the newly grouped points. – Kate Oct 20 '20 at 17:27
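One possible reading of this comment is single-linkage hierarchical clustering, where the distance from a point to an existing group is the distance to that group's nearest member. The following is only a sketch under that assumption, not something proposed in the thread: it reuses distm() and the dummy data from the answer, and the choice of k = 3 groups is arbitrary.

```r
library(geosphere)

xy <- data.frame(lon = c(0, 90, 10, -120), lat = c(0, 90, 10, -45))
d  <- distm(xy)                       # pairwise distances in meters

# Single linkage: a point's distance to a group is its distance
# to the nearest member of that group.
hc <- hclust(as.dist(d), method = "single")

# Cut the tree into a chosen number of groups (k = 3 is an arbitrary choice).
xy$C <- cutree(hc, k = 3)
xy
```

With these four points, rows 1 and 3 merge first, so they share a group number while rows 2 and 4 each stay on their own. Passing h (a distance threshold in meters) to cutree() instead of k would group by a maximum merge distance rather than by a fixed number of groups.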
- @Kate I guess you can remove the found pairs from the data frame and do the same thing again to find the next pairs with the minimum distance – ThomasIsCoding Oct 20 '20 at 20:31
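The suggestion in this last comment could be sketched roughly as follows. The helper name group_closest_pairs is hypothetical; instead of deleting rows from the data frame, it masks already-paired rows in the distance matrix with Inf so that group numbers map straight back to the original row order. Note one difference from the answer: coincident points (distance 0) would be paired here, since only the diagonal is excluded.

```r
library(geosphere)

# Hypothetical helper: repeatedly take the closest remaining pair of rows
# and give both rows the next group number.
group_closest_pairs <- function(d) {
  grp <- rep(NA_integer_, nrow(d))
  diag(d) <- Inf                      # ignore self-distances
  k <- 0L
  while (sum(is.na(grp)) >= 2) {
    pair <- which(d == min(d), arr.ind = TRUE)[1, ]  # one closest pair
    k <- k + 1L
    grp[pair] <- k
    d[pair, ] <- Inf                  # exclude both rows from later rounds
    d[, pair] <- Inf
  }
  grp
}

xy <- data.frame(lon = c(0, 90, 10, -120), lat = c(0, 90, 10, -45))
xy$C <- group_closest_pairs(distm(xy[, c("lon", "lat")]))
xy
```

For the dummy data, rows 1 and 3 get group 1 and the remaining rows 2 and 4 get group 2. With an odd number of rows, the last unpaired row keeps NA.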