For the second question (number of other stores within a certain radius), you can use the geodist
function as you mentioned and then some data management to get your results.
The following commands will work on a small dataset of five stores. It contains the store ID with the lat and long coordinates.
n.stores <- 5
set.seed(1234)
stores <- cbind(sort(sample(1:10, n.stores)),
runif (n.stores, -0.1, 0.1),
runif (n.stores, -0.1, 0.1)); stores
colnames(stores) <- c("store_id", "lat", "long")
stores
# store_id lat long
# [1,] 1 0.033216752 0.08468670
# [2,] 4 0.002850228 -0.04153683
# [3,] 5 0.038718258 0.06745913
# [4,] 6 0.008994967 -0.04275534
# [5,] 10 -0.043453283 -0.04663584
Now calculate the distances between each store and the others, including itself.
d <- geodist (stores[,2:3]) # A 5-by-5 matrix
d
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0.000 14427.8245 2009.804 14416.5478 16899.483
# [2,] 14427.825 0.0000 12752.057 696.1801 5176.953
# [3,] 2009.804 12752.0567 0.000 12686.0604 15625.871
# [4,] 14416.548 696.1801 12686.060 0.0000 5844.660
# [5,] 16899.483 5176.9527 15625.871 5844.6603 0.000
The distances are in metres. Convert this matrix to a data frame to facilitate the next step.
df <- as.data.frame(d)
colnames(df) <- stores[,'store_id']
df$store_A <- as.numeric(colnames(df))
# 1 4 5 6 10 store_A
# 1 0.000 14427.8245 2009.804 14416.5478 16899.483 1
# 2 14427.825 0.0000 12752.057 696.1801 5176.953 4
# 3 2009.804 12752.0567 0.000 12686.0604 15625.871 5
# 4 14416.548 696.1801 12686.060 0.0000 5844.660 6
# 5 16899.483 5176.9527 15625.871 5844.6603 0.000 10
Then you can reshape to long-form and list or count the number of stores within a certain distance of each other (say 10 km).
library(tidyr)
library(dplyr)
df.long <- pivot_longer(df, -store_A,
names_to="store_B",
values_to="distance",
names_transform=as.numeric) %>%
filter(store_A != store_B) # omit self-references
df.long
# A tibble: 10 × 3
# store_A store_B distance
# <dbl> <dbl> <dbl>
# 1 1 4 14428.
# 2 1 5 2010.
# 3 1 6 14417.
# 4 1 10 16899.
# 5 4 1 14428.
# 6 4 5 12752.
# 7 4 6 696.
# 8 4 10 5177.
# 9 5 1 2010.
#10 5 4 12752.
#11 5 6 12686.
#12 5 10 15626.
#13 6 1 14417.
#14 6 4 696.
#15 6 5 12686.
#16 6 10 5845.
#17 10 1 16899.
#18 10 4 5177.
#19 10 5 15626.
#20 10 6 5845.
Now the group counts.
group_by(df.long, store_A) %>%
filter(distance<10000) %>%
print() %>%
summarise(n=n())
# A tibble: 8 × 3
# Groups: store_A [4]
# store_A store_B distance
# <dbl> <dbl> <dbl>
# 1 1 5 2010.
# 2 4 6 696.
# 3 4 10 5177.
# 4 5 1 2010.
# 5 6 4 696.
# 6 6 10 5845.
# 7 10 4 5177.
# 8 10 6 5845.
# A tibble: 5 × 2
# store_A n
# <dbl> <int>
# 1 1 1
# 2 4 2
# 3 5 1
# 4 6 2
# 5 10 2
So, store "1" has 1 store within 10 km, store "4" has 2, store "5" has 1,
and stores "6" and "10" have two each.