
The goal is to calculate the driving distance from every traffic counter along the highway to every gas station in Belgium. The dataframe BELGIUM holds the longitude and latitude of the counters; the dataframe Stations holds those of the gas stations.

For now I use a nested for loop, which makes one gmapsdistance call per counter/station pair. This works fine for small dataframes but is very slow for large ones.

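library(gmapsdistance)

# gmapsdistance reads coordinates as "lat+lon", so select the columns in that order
Stations1 <- Stations[, c("lat", "lon")]
names(Stations1) <- NULL
BELGIUM1 <- BELGIUM[, c("lat", "lon")]
names(BELGIUM1) <- NULL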

distancesToStation <- data.frame(matrix(NA,nrow = nrow(Stations),ncol = nrow(BELGIUM)))

for (i in 1:nrow(BELGIUM)) {
  for (j in 1:nrow(Stations)) {
    # one API request per station/counter pair; [[2]] is the distance in metres
    distancesToStation[j, i] <- gmapsdistance(
      origin = paste0(Stations1[j, 1], "+", Stations1[j, 2]),
      destination = paste0(BELGIUM1[i, 1], "+", BELGIUM1[i, 2]),
      mode = "driving", key = "X")[[2]] / 1000
  }
}

save(distancesToStation, file = 'DistanceMatrix.Rdata')

This code works perfectly for small dataframes. Is there a way to speed it up?

  • probably you need `expand.grid` between your two tables then use `mapply` – moodymudskipper May 25 '19 at 13:12
  • I need the distance from each counter to each station, not the distance between the counters itself, so does this also work with expand.grid? – Amon De Keyser May 25 '19 at 13:19
  • it will be easier with `merge` as in Cole's answer below, `expand.grid` works on vectors not tables (sorry) – moodymudskipper May 25 '19 at 13:26
  • Hi Amon. Check out [this link](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to make a reproducible example. Since we don't have access to your data, it's really tough to help you. You need to try to make your data and code as accessible as possible. Good luck! – Marian Minar May 25 '19 at 17:27
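
The `expand.grid` plus `mapply` idea from the comments can be sketched as follows. Because `expand.grid` works on vectors, the sketch pairs up row indices rather than the tables themselves. The coordinates are made up, and `pair_distance_km` is a hypothetical stand-in (a straight-line haversine distance, not a driving distance) so the structure can be run without an API key.

```r
# Hypothetical stand-in for the API call: great-circle (haversine) distance in km.
# It is NOT a driving distance; swap in the real lookup once the structure works.
pair_distance_km <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * 6371 * asin(pmin(1, sqrt(a)))
}

# Made-up sample data (assumed column names: lat, lon)
counters <- data.frame(lat = c(50.85, 51.22), lon = c(4.35, 4.40))
stations <- data.frame(lat = c(50.63, 51.05, 50.47),
                       lon = c(5.57, 3.72, 4.87))

# Every station index paired with every counter index
idx <- expand.grid(station = seq_len(nrow(stations)),
                   counter = seq_len(nrow(counters)))

# One distance per (station, counter) pair
idx$km <- mapply(function(s, k) {
  pair_distance_km(stations$lat[s], stations$lon[s],
                   counters$lat[k], counters$lon[k])
}, idx$station, idx$counter)
```

Note that `mapply` still evaluates one pair at a time, so with a real API call this mainly tidies the code; the number of requests stays the same.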

1 Answer


This first generates all of the combinations with a cross join, merge(..., ..., by = NULL), and then passes the resulting columns to gmapsdistance in one vectorized call. Note that I don't have an API key, so I couldn't test that part.

BELGIUM <- data.frame(counters = 1:10
                      , lat = runif(10, 10, 20)
                      , lon = runif(10, 40, 50))

STATIONS <- data.frame(station = LETTERS[1:10]
                       , lat = runif(10, 10, 20)
                       , lon = runif(10, 40, 50))

All_Combos <- merge(BELGIUM, STATIONS, by = NULL)

# .x columns come from BELGIUM (counters), .y columns from STATIONS
All_Combos$distancesToStation = gmapsdistance(origin = paste0(All_Combos$lat.y, "+", All_Combos$lon.y),
                                   destination = paste0(All_Combos$lat.x, "+", All_Combos$lon.x),
                                   mode = "driving", key = "X")[[2]] / 1000
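
If the stations-by-counters layout from the question is still wanted, the long table can be folded back into a matrix. The following is a minimal sketch with made-up coordinates; the distance column is filled with a crude placeholder rather than real gmapsdistance output, since the API call can't be run without a key.

```r
# Small made-up tables with the same columns as the example above
BELGIUM <- data.frame(counters = 1:2,
                      lat = c(50.85, 51.22),
                      lon = c(4.35, 4.40))
STATIONS <- data.frame(station = c("A", "B", "C"),
                       lat = c(50.63, 51.05, 50.47),
                       lon = c(5.57, 3.72, 4.87))

All_Combos <- merge(BELGIUM, STATIONS, by = NULL)  # 2 x 3 = 6 rows

# Placeholder values only: a crude degrees-to-km figure, NOT a driving distance.
# In practice this column would hold the gmapsdistance results.
All_Combos$distancesToStation <- with(
  All_Combos,
  111 * sqrt((lat.x - lat.y)^2 + (lon.x - lon.y)^2)
)

# Fill a stations-by-counters matrix by name, so the row order of the join
# does not matter
dist_matrix <- matrix(NA_real_,
                      nrow = nrow(STATIONS), ncol = nrow(BELGIUM),
                      dimnames = list(as.character(STATIONS$station),
                                      as.character(BELGIUM$counters)))
dist_matrix[cbind(as.character(All_Combos$station),
                  as.character(All_Combos$counters))] <-
  All_Combos$distancesToStation
```

One thing to verify against the gmapsdistance documentation: as far as I know the package has a `combinations` argument, and row-by-row pairs may need `combinations = "pairwise"`, in which case the shape of the returned distance object differs from the single-pair case. Treat that as an assumption to check.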
Cole