I would just use an apply
function. First, let's make your problem reproducible by creating some "fake" data - I am making the lon/lat pairs artificially close so that we can get a few TRUE
's back in the results:
library(geosphere)
df1 <- data.frame(X1 = sample(letters, 100, replace = T),
x2 = sample(letters, 100, replace = T),
x3 = sample(letters, 100, replace = T),
lon = sample(10:12 + rnorm(100, 0, 0.1), 100, replace = T),
lat = sample(10:12 + rnorm(100, 0, 0.1), replace = T))
df2 <- data.frame(x1 = sample(letters, 100, replace = T),
lon = sample(10:12 + rnorm(100, 0, 0.1), 100, replace = T),
lat = sample(10:12 + rnorm(100, 0, 0.1), 100, replace = T))
We can then create two matrices containing the values of interest:
m1 <- as.matrix(df1[, c("lon", "lat")])
m2 <- as.matrix(df2[1:11, c("lon", "lat")])
Now we can use the apply
function across the rows of m2
which return a 100 X 11
matrix:
results <- apply(m2, 1, FUN = function(x) distCosine(x, m1))
To get the less than 5 mi (~8046.72m), results, we simply subset:
results[results < 8046.72]
# Showing the next two for alternative output
which(results < 8046.72)
which(results < 8046.72, arr.ind = T)
Note: In your question, it looks like you are interested in the first 1,452 rows -- this would mean the results would we be a 1,452 X 11
matrix.