My data is similar to the following:

# A tibble: 7 x 3
# Groups:   offense [7]
    lon   lat offense           
  <dbl> <dbl> <fct>             
1 -95.3  29.8 aggravated assault
2 -95.4  29.9 auto theft        
3 -95.3  29.8 burglary          
4 -95.5  29.7 murder            
5 -95.4  30.0 rape              
6 -95.5  29.8 robbery           
7 -95.4  29.8 theft  

I can run the following:

cbind(df, X = rowSums(distm(df[,1:2], fun = distHaversine) / 1000 <= 10))

# A tibble: 7 x 4
# Groups:   offense [7]
    lon   lat offense                X
  <dbl> <dbl> <fct>              <dbl>
1 -95.3  29.8 aggravated assault     3
2 -95.4  29.9 auto theft             2
3 -95.3  29.8 burglary               3
4 -95.5  29.7 murder                 1
5 -95.4  30.0 rape                   2
6 -95.5  29.8 robbery                1
7 -95.4  29.8 theft                  3

This gives me, for each row, the number of points within a 10 km radius (the count includes the point itself, since its distance to itself is 0), according to this SO post.

What I would like to know is how to modify that call so that it also returns which rows fall within the radius of each point. For example, the first row has a value of 3, which might be made up of the observations in rows 2, 4 and 7.

It might look like:

    lon   lat offense                X   points
1 -95.3  29.8 aggravated assault     3   c(2,4,7)
2 -95.4  29.9 auto theft             2   c(2,3)
3 -95.3  29.8 burglary               3   c(4,5,7)

Once I have these, I would like to create lists such that row 1 would be a list containing lists of 1, 2, 4 and 7. (However, this might be a different question.)

Data:

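library(geosphere)  # distm(), distHaversine()
library(ggmap)      # provides the `crime` data set
library(dplyr)      # %>%, group_by(), sample_n(), select()

# One randomly sampled row per offense; results differ between runs unless a seed is set
df <- crime %>% 
  group_by(offense) %>% 
  sample_n(1) %>% 
  select(lon, lat, offense)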
user113156
  • Is the `crime` data from `geosphere`? – akrun Nov 01 '19 at 16:03
  • Sorry, it's from `library(ggmap)`. – user113156 Nov 01 '19 at 16:03
  • The values are not reproducible, as I get different values. – akrun Nov 01 '19 at 16:07
  • Perhaps `m1 <- which(distm(df[,1:2], fun = distHaversine) / 1000 <= 10, arr.ind = TRUE); split(m1[, 'row'], m1[, 'col'])` – akrun Nov 01 '19 at 16:11
  • That's probably because of the `sample_n(1)`. In my actual data each row is a unique point; if I did not `group_by(offense)` I would get `robbery` maybe 3 times. I guess I could have used `head(1)` or something. – user113156 Nov 01 '19 at 16:11
  • This looks good, thanks! How can I 1) paste these values into a `df` column as `c(1,2,3)...` and 2) extract the information from the rows that the `split(m1...` output comes from and put the data into lists? – user113156 Nov 01 '19 at 16:16
  • You can assign a `list` column with `df$points <- split(m1[, 'row'], m1[, 'col'])`, but as the values are different, I couldn't test it. – akrun Nov 01 '19 at 16:17
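
The `which(..., arr.ind = TRUE)` plus `split()` idea from the comments can be sketched roughly as below. This is only a minimal sketch under some assumptions: the three-row `df` is a made-up stand-in for the sampled `crime` rows (the coordinates are hypothetical, not real data), dropping each point's match with itself via `diag()` is optional and not part of the original call, and `neighbours` is just an illustrative name.

```r
library(geosphere)

# Hypothetical stand-in for the sampled data; coordinates are made up
df <- data.frame(
  lon     = c(-95.30, -95.35, -95.90),
  lat     = c(29.80, 29.82, 30.40),
  offense = c("burglary", "theft", "robbery")
)

# Pairwise Haversine distances, converted from metres to kilometres
d_km <- distm(df[, c("lon", "lat")], fun = distHaversine) / 1000

# Logical matrix: TRUE where two points are at most 10 km apart
within <- d_km <= 10
diag(within) <- FALSE  # assumption: a point should not count as its own neighbour

# (row, col) index pairs of all TRUE cells
idx <- which(within, arr.ind = TRUE)

# Group neighbour row numbers by point; the explicit factor levels keep
# points without any neighbour as integer(0) instead of dropping them
df$points <- unname(split(
  idx[, "row"],
  factor(idx[, "col"], levels = seq_len(nrow(df)))
))
df$X <- lengths(df$points)

# One data frame of neighbouring rows per point
neighbours <- lapply(df$points, function(i) df[i, c("lon", "lat", "offense")])
```

Without the `factor(..., levels = ...)` step, `split()` would return fewer elements than there are rows whenever some point has no neighbour, and the assignment to `df$points` would fail. If each point should be included in its own list (as in the original `rowSums()` count), leave out the `diag(within) <- FALSE` line.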
