
I have two data frames:

  1. users, which contains the longitude and latitude of every user
  2. stops, which contains the longitude and latitude of every bus stop

I would like to calculate the percentage of users who have at least one bus stop within a given radius (in meters).

So I wrote a function with a nested loop that iterates over each user and breaks out of the inner loop as soon as one stop is found within the desired radius.

The solution works, but it is slow. Is there a way to speed it up?

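library(geosphere)  # provides distm() and distHaversine()

percentage_of_users_near_to_busstop <- function(users, stops, radius) {
   # Per-user counter; it only ever reaches 1 because of the break below
   users$stops_in_radius <- 0
   users_length <- nrow(users)
   stops_length <- nrow(stops)
   # seq_len() avoids the 1:0 pitfall when a data frame has no rows
   for (i in seq_len(users_length)) {
      for (j in seq_len(stops_length)) {
         # Haversine distance in meters between user i and stop j
         if (distm(c(users$longitude[i], users$latitude[i]),
                   c(stops$longitude[j], stops$latitude[j]),
                   fun = distHaversine) < radius) {
            users[i, "stops_in_radius"] <- users[i, "stops_in_radius"] + 1
            break  # one stop is enough; move on to the next user
         }
      }
      #print(paste0(i/users_length*100, "%"))
   }
   # Returns a fraction between 0 and 1, not a value scaled to 100
   return(nrow(subset(users, stops_in_radius > 0)) / nrow(users))
}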
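
For comparison, here is a minimal sketch of a vectorized variant that computes all user-to-stop distances in one step instead of calling distm() once per pair. It is an illustration under stated assumptions, not a tested replacement: the names haversine_matrix and share_of_users_near_stop are hypothetical, the haversine formula is written out in base R rather than taken from geosphere, and the Earth radius of 6378137 m is chosen to match the default of geosphere::distHaversine.

```r
# Hypothetical helper: haversine distance in meters between every
# (lon1, lat1) point and every (lon2, lat2) point, as an n x m matrix.
haversine_matrix <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  phi1 <- lat1 * to_rad
  phi2 <- lat2 * to_rad
  # Pairwise differences in latitude and longitude (radians)
  dphi    <- outer(phi1, phi2, function(a, b) b - a)
  dlambda <- outer(lon1 * to_rad, lon2 * to_rad, function(a, b) b - a)
  a <- sin(dphi / 2)^2 + outer(cos(phi1), cos(phi2)) * sin(dlambda / 2)^2
  # Clamp to 1 to guard against rounding slightly above the asin() domain
  2 * r * asin(pmin(sqrt(a), 1))
}

# Hypothetical wrapper: fraction of users with at least one stop
# strictly closer than `radius` meters.
share_of_users_near_stop <- function(users, stops, radius) {
  d <- haversine_matrix(users$longitude, users$latitude,
                        stops$longitude, stops$latitude)
  mean(rowSums(d < radius) > 0)
}

# Toy data: the first user is about 111 m from the only stop,
# the second user is far away.
users <- data.frame(longitude = c(0, 10), latitude = c(0, 10))
stops <- data.frame(longitude = 0, latitude = 0.001)
share_of_users_near_stop(users, stops, 200)
```

The trade-off is memory: the distance matrix has nrow(users) * nrow(stops) entries, so for large inputs the users would probably have to be processed in chunks.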
David Lexa