
Say I have some example data containing the location, condition, and replacement cost of equipment:

set.seed(0)

n <- 10

machine_data <- data.frame(c(1:n), sample.int(2, n, replace=TRUE), runif(n, min=50, max=100), runif(n, min=20000, max=100000))
colnames(machine_data) <- c("ID", "Location", "Condition", "ReplaceCost")

I want to determine which course of action (Do Nothing, Repair, Replace) to assign to each $Location over successive periods. The total number of permutations of these choices is given by:

actions <- c("do_nothing", "repair", "replace")
periods <- 3
perms <- gtools::permutations(n=length(actions), r=periods, v=actions, repeats.allowed = T)
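As a quick cross-check of that count, here is a minimal base R sketch (an alternative to `gtools`, not part of the original code). With repeats allowed, there are `length(actions)^periods` sequences; `grid` is a hypothetical name used only for this illustration, and `expand.grid()` orders its rows differently from `gtools::permutations()`.

```r
actions <- c("do_nothing", "repair", "replace")
periods <- 3

# One column per period, one row per sequence of actions
grid <- expand.grid(rep(list(actions), periods), stringsAsFactors = FALSE)

nrow(grid)               # number of action sequences
length(actions)^periods  # the same count from the formula
```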

So in this case, I have 27 possible permutations of actions for each $Location and I collect these different possibilities into a dataframe:

n <- length(unique((machine_data$Location)))
decisions <- do.call(rbind, replicate(n, perms, simplify=FALSE))
df <- data.frame(rep(c(1:n), each=nrow(perms)), decisions)
action_labels <- paste("Period", c(1:periods))
colnames(df) <- c("Location", action_labels)
df$Improvement <- 0
df$Cost <- 0

The idea is to calculate the relative improvement and cost of each permutation. These values are then sent to an optimization model to select the best course of action for each $Location given a budget constraint. The function to perform a single action is:

library(dplyr)  # provides %>% and mutate() used below

replace_threshold <- 50
repair_threshold <- 85
perform_action <- function(action, location_data, repair_threshold, replace_threshold)
{
  # repair decisions and resulting condition
  location_data <- location_data %>%
    # repair results in condition of 95; replace results in 100
    mutate(new_cond = ifelse(action == "repair" & Condition <= repair_threshold, 95, 
                      ifelse(action == "replace" & Condition <= replace_threshold, 100, Condition))
           )
  # the total score for a location is a cost-weighted aggregation of individual equipment
  location_score <- sum(location_data$new_cond * location_data$ReplaceCost)/sum(location_data$ReplaceCost)
  
  return(location_score)
}
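To make the cost-weighted aggregation concrete, here is a small worked sketch in base R. The two-row `toy` table and its values are invented for illustration and are not taken from `machine_data`; the repair rule mirrors the one in `perform_action()` above.

```r
# Hypothetical two-machine location (values invented for illustration)
toy <- data.frame(Condition = c(60, 90), ReplaceCost = c(1000, 3000))

# Score before any action: cost-weighted mean of Condition
before <- sum(toy$Condition * toy$ReplaceCost) / sum(toy$ReplaceCost)

# "repair" lifts every machine at or below the threshold to 95
repair_threshold <- 85
new_cond <- ifelse(toy$Condition <= repair_threshold, 95, toy$Condition)
after <- sum(new_cond * toy$ReplaceCost) / sum(toy$ReplaceCost)

before  # 82.5
after   # 91.25
```

Only the first machine is at or below the threshold, so the score moves from 82.5 to 91.25; the more expensive machine dominates the weighting.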

I can do this row-wise in a loop:

for(i in 1:n)
{
  location_data <- machine_data[machine_data$Location == i,]
  for(r in 1:nrow(df))
  {
    if(df$Location[r] == i)
    {
      # the period columns are columns 2..(periods + 1)
      action_set <- df[r, 2:(periods + 1)]
      for(a in 1:length(action_set))
      {
        action <- toString(action_set[a])
        df$Improvement[r] <- perform_action(action, location_data, repair_threshold, replace_threshold)
        # calculate df$Cost here...
        # update machine_data$Condition for next period here...
      }
    }
  }
}

However, this becomes slow for larger-scale problems, and I'm looking for a way to speed it up. Is it possible to use something like purrr's accumulate() function to execute each of the actions in the action_set in the loop above? I'm open to non-tidyverse approaches; accumulate() is just one option that I've tried to figure out without success.

coolhand
  • where does the `permutations` function come from? – Mark Jul 19 '23 at 09:22
  • `permutations()` is in the `gtools` package. Post has been edited to reflect this – coolhand Jul 19 '23 at 13:02
  • Your perform_action() function doesn't seem to accumulate any values when run through this loop. It just keeps replacing the `Improvement` variable with each iteration through the different actions, so that the value at the end is just the result of `Period 3` – pgcudahy Jul 20 '23 at 07:15
  • @pgcudahy You're right. That's because the `$Condition` variable isn't updated (it's just a commented action). I'll try to update – coolhand Jul 20 '23 at 13:40

1 Answer


Your example loop doesn't appear to accumulate the improvements across periods; instead, it just replaces the previous period's value with the next. So I can't check my answer against your example data, but I think what you want is reduce() instead of accumulate().

Instead of your inner loop:

action_set <- df[r,2:(length(actions)+1)]
for(a in 1:length(action_set)){
    action <- toString(action_set[a])
    df$Improvement[r] <- perform_action(action, location_data, repair_threshold, replace_threshold)
}

accumulate() will give you a vector of each successive calculation:

purrr::accumulate(
    #First grab the actions from the row as a vector
    df[1, 2:4] %>% purrr::as_vector(),
    #Then pass them to an anonymous function. x contains the accumulated value, while y is the action
    #I've set the location to 1 as if this were one run through the outer loop
    function(x,y) {
    #Set the accumulated value to the prior value plus the calculated action
        x <- x + perform_action(y, machine_data[machine_data$Location == 1,],
            repair_threshold, replace_threshold)
        return(x)
    },
    #Set the initial value of x to 0
    .init = 0)
#>     .init  Period 1  Period 2  Period 3 
#>   0.00000  76.79909 153.59818 230.39728

You can see that it returns a vector of each successive value. What I think you want is just the final value, which you get with reduce():

purrr::reduce(df[1, 2:4] %>% purrr::as_vector(),
    function(x,y) {
        x <- x + perform_action(y, machine_data[machine_data$Location == 1,],
            repair_threshold, replace_threshold)
        return(x)
    }, .init = 0)
#> [1] 230.3973
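
The calls above add up scores but never change `Condition` between periods. If the intent is for each action to act on the condition left by the previous one, one possible sketch is to carry the equipment table itself as the accumulated value. This uses base R's `Reduce()` rather than purrr. The helpers `apply_action()` and `score()` and the `toy` data are hypothetical names and values introduced for illustration, and the sketch assumes no deterioration between periods.

```r
# Hypothetical helper: apply one action and return the updated equipment table
apply_action <- function(location_data, action,
                         repair_threshold = 85, replace_threshold = 50) {
  cond <- location_data$Condition
  if (action == "repair") {
    cond[cond <= repair_threshold] <- 95
  } else if (action == "replace") {
    cond[cond <= replace_threshold] <- 100
  }
  location_data$Condition <- cond
  location_data
}

# Hypothetical helper: cost-weighted condition score for one location
score <- function(location_data) {
  weighted.mean(location_data$Condition, location_data$ReplaceCost)
}

# Invented example data for a single location
toy <- data.frame(Condition = c(40, 60, 90), ReplaceCost = c(1000, 2000, 1000))
action_set <- c("do_nothing", "repair", "replace")

# The accumulated value is the data frame; each step sees the updated Condition
states <- Reduce(apply_action, action_set, init = toy,
                 accumulate = TRUE, simplify = FALSE)
scores <- vapply(states, score, numeric(1))
scores  # initial score followed by the score after each period
```

With these numbers the score stays at 62.5 after `do_nothing`, rises to 93.75 after `repair`, and is unchanged by `replace` because no machine is at or below the replace threshold any more. The improvement for the whole sequence would then be the last score minus the first.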
pgcudahy