
In R, I have a data frame with x, y as lat, long. How do I find which rows are separated by the minimum distance and assign a number in a new column to mark them? A simple example below shows the two rows, (5,3) and (5,2), that have the minimum distance; column C gives them the same group number.

df

Kate

1 Answer


I guess you may need distm from the geosphere package, which computes a distance matrix for longitude/latitude points

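library(geosphere)
xy <- setNames(data.frame(rbind(c(0,0),c(90,90),c(10,10),c(-120,-45))),c("lon","lat"))
# pairwise distance matrix in metres; distm expects longitude first, then latitude
d <- distm(xy)
# row/column indices of the smallest non-zero distance (d > 0 skips the diagonal)
inds <- which(min(d[d>0])==d,arr.ind = TRUE)
# new column C: 1 for the rows of the closest pair, NA for all other rows
out <- cbind(xy,C = NA)
out$C[inds[,"row"]] <- 1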

which gives

> out
   lon lat  C
1    0   0  1
2   90  90 NA
3   10  10  1
4 -120 -45 NA

Dummy data

> dput(xy)
structure(list(lon = c(0, 90, 10, -120), lat = c(0, 90, 10, -45
)), class = "data.frame", row.names = c(NA, -4L))
ThomasIsCoding
  • Once I identify the two associated rows, how would I go about making a column that categorizes those two points? Can I map those results back to the original dataframe? – Kate Oct 20 '20 at 04:08
  • Thank you. From here, how would I iterate that for the rest of the points? So I want to find the next minimum distance and group those points. The part I'm especially trying to understand is that it is also comparing the minimum distance between a point and the newly grouped points. – Kate Oct 20 '20 at 17:27
  • @Kate I guess you can remove the found pairs from the data frame and do the same thing again to find the next pairs with the minimum distance – ThomasIsCoding Oct 20 '20 at 20:31
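
The suggestion in the last comment (take out each pair once it is found, then search again) could be sketched roughly as follows. This is only an illustration under some assumptions: it reuses the dummy data from the answer, the `group` counter and the trick of overwriting used rows and columns with `Inf` are choices made here and not part of the original answer, and ties are resolved by simply taking the first match.

```r
library(geosphere)

# Same dummy data as in the answer (longitude first, then latitude)
xy <- data.frame(lon = c(0, 90, 10, -120), lat = c(0, 90, 10, -45))

d <- distm(as.matrix(xy[, c("lon", "lat")]))  # pairwise distances in metres
diag(d) <- Inf          # a point can never be paired with itself
xy$C <- NA_integer_     # group number, NA while a point is still unpaired
group <- 0L             # hypothetical counter, introduced for this sketch

# Keep pairing while at least two unassigned points remain
while (sum(is.na(xy$C)) >= 2) {
  # (row, col) of the first entry holding the current minimum distance
  idx <- which(d == min(d), arr.ind = TRUE)[1, ]
  group <- group + 1L
  xy$C[idx] <- group
  # Take both points out of further consideration
  d[idx, ] <- Inf
  d[, idx] <- Inf
}

xy
```

With the four dummy points, rows 1 and 3 end up in group 1 and rows 2 and 4 in group 2; with an odd number of rows, the last leftover point keeps `NA`. Note that this only ever forms pairs. If a point should also be compared against groups that were already formed, as asked in the second comment, that is closer to agglomerative clustering, and base R's `hclust()` on `as.dist(d)` might be worth a look.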