Is it possible to speed up code by a factor ~1000 using base-R?

Question

The reproducible example below can be used as test case. I am looking for a base-R solution because

I am not experienced with C++ (to integrate Rcpp) or Java (to integrate rJava)
Thus it could be that the "speed-problem" is inherent to the algorithm
I like the reporting tools and the rapid prototyping advantages of R.

Backround according to @Gregor's input:

The original problem is in the field of dial-a-ride problems. Thus we face the question "is it better to merge two tours?" billions of times. The example below has the same structure (How far a two places appart? Which would be the best next location to stop? What is the result if we merge two tours?).
We know about the great answer on how to speed up R-code but in our case we have an additional "business logic" which shows up in various if-else statements. Thus, it seems not possible to us, to achieve vectorized code when if-else statements are present - but may be we miss something.

Reproducible Example

generate_random_sequence <- function(nrows=1000) {
  x <- c(0, runif(nrows, min = 0, max = 100), 0)
  y <- c(0, runif(nrows, min = 0, max = 100), 0)
  loc <- c("Start", floor(runif(nrows, 1000, 9999)), "Start")
  return(data.table::data.table(x = x, y = y, loc = loc, stringsAsFactors = FALSE))
}

dist <- function(x1, y1, x2, y2) {
  return(sqrt((x1-x2)^2 + (y1-y2)^2))
}

get_best_loc <- function(cur_loc, stop1, stop2) {
  d1 <- dist(cur_loc$x, cur_loc$y, stop1$x, stop1$y)
  d2 <- dist(cur_loc$x, cur_loc$y, stop2$x, stop2$y)
  if (d1 <= d2) {
    best_stop <- stop1
    stop_id <- 1L
  } else {
    best_stop <- stop2
    stop_id <- 2L
  }
  return(list(best_stop = best_stop, stop_id = stop_id))
}

combine_sequence <- function(t1, t2) {
  t_combined <- c(1, rep(NA, nrow(t1) + nrow(t2) - 4), 1)
  ind_max <- c(nrow(t1) - 1, nrow(t2) - 1)
  last_stop <- t1[1, ]
  ind_next_stop <- c(2, 2)
  ind_t_combined <- 1
  while (all(ind_next_stop <= ind_max)) {
    r <- get_best_loc(last_stop, 
                        t1[ind_next_stop[1],], t2[ind_next_stop[2],])
    best_stop <- r$best_stop; stop_id <- r$stop_id; rm(r)
    if (stop_id == 1) {
      ind_next_stop[1] <- ind_next_stop[1] + 1
    } else {
      ind_next_stop[2] <- ind_next_stop[2] + 1
    }
    ind_t_combined <- ind_t_combined + 1
    t_combined[ind_t_combined] <- stop_id
    last_stop <- best_stop
  }

  if (ind_next_stop[1] < ind_max[1]) {
    t_combined[(ind_t_combined + 1):(length(t_combined) - 1)] <- 
      t1$loc[ind_next_stop[1]:(nrow(t1) - 1)]
  } else {
    t_combined[(ind_t_combined + 1):(length(t_combined) - 1)] <- 
      t2$loc[ind_next_stop[2]:(nrow(t2) - 1)]
  }
  return(t_combined)
}

n <- 1e2
t1 <- generate_random_sequence(n)
t2 <- generate_random_sequence(n)
microbenchmark::microbenchmark(combine_sequence(t1, t2),
                               times = 10L, unit = "s")

Input from comments:

When I use profiling I get the following output. To me it is not clear, if I have levers where I can obtain a speedup of x1000. (I would just be in the dark)

Code for profiling:

  profvis::profvis({
    t_combined <- c(1, rep(NA, nrow(t1) + nrow(t2) - 4), 1)
    ind_max <- c(nrow(t1) - 1, nrow(t2) - 1)
    last_stop <- t1[1, ]
    ind_next_stop <- c(2, 2)
    ind_t_combined <- 1
    while (all(ind_next_stop <= ind_max)) {
      r <- get_best_loc(last_stop, 
                        t1[ind_next_stop[1],], t2[ind_next_stop[2],])
      best_stop <- r$best_stop; stop_id <- r$stop_id; rm(r)
      if (stop_id == 1) {
        ind_next_stop[1] <- ind_next_stop[1] + 1
      } else {
        ind_next_stop[2] <- ind_next_stop[2] + 1
      }
      ind_t_combined <- ind_t_combined + 1
      t_combined[ind_t_combined] <- stop_id
      last_stop <- best_stop
    }

    if (ind_next_stop[1] < ind_max[1]) {
      t_combined[(ind_t_combined + 1):(length(t_combined) - 1)] <- 
        t1$loc[ind_next_stop[1]:(nrow(t1) - 1)]
    } else {
      t_combined[(ind_t_combined + 1):(length(t_combined) - 1)] <- 
        t2$loc[ind_next_stop[2]:(nrow(t2) - 1)]
    }
})

[See here](http://adv-r.had.co.nz/Profiling.html) for resources on profiling R code. — Gregor Thomas, Aug 07 '18 at 16:05
Also, once you've identified performance bottlenecks, consider using Rcpp or rJava to speed up those parts without needing to port your entire codebase. — Gregor Thomas, Aug 07 '18 at 16:11
One potential problem is `rbind`, c.f. https://stackoverflow.com/questions/50990237/for-loop-in-r-looking-for-an-alternative/50991437#50991437. — Ralf Stubner, Aug 07 '18 at 16:18
It's looking better. I'll re-open, but please also describe your code in words. There are no comments, so it would be nice to have a description of the goal. You profiling shows that lots of time is spent in `get_best_loc`. But looking at the code it's not at all clear to me that what the goal of `combine_sequence` is or why you need a `while` loop running one-at-a-time instead of, say, batching every 1000 together for some vectorization. — Gregor Thomas, Aug 09 '18 at 15:11

Is it possible to speed up code by a factor ~1000 using base-R?

0 Answers0