Recursive loop with dplyr::group_by

Question

I have a function for exponential smoothing. I need to apply this for time series by group. In the beginning I need to set fixed initial values, then for each year the function calculates results that depends on the previous year's result (or on initial values if first year).

I have quite a lot data and the speed is the primary concern. So how to do this with dplyr or tidyverse?

The code below works, but just builds on the initialValues.

library(tidyverse)
library(expm)

# Function:

f <- function(L1, L2, L3, L4, L5, A) {
 solve(A) %*% (expm(A) %*% (A %*% initialValues + c(L1, L2, L3, L4, L5)))
}

# Data:

df <- as_tibble(list(year = rep(2000:2002, 2), 
                 id = rep(letters[1:2], 3), 
                 L1 = sample(1:10, 6),     
                 L2 = sample(1:10, 6), 
                 L3 = sample(1:10, 6),        
                 L4 = sample(1:10, 6),        
                 L5 = sample(1:10, 6),
                 A = list(matrix(runif(25, 0, 1), ncol = 5),
                          matrix(runif(25, 0, 1), ncol = 5),
                          matrix(runif(25, 0, 1), ncol = 5),
                          matrix(runif(25, 0, 1), ncol = 5),
                          matrix(runif(25, 0, 1), ncol = 5),
                          matrix(runif(25, 0, 1), ncol = 5)
                          )))

initialValues <- c(5, 5, 6, 8, 9)

# Call:

final <- df %>%
  group_by(id) %>%
  mutate(result = pmap(list(L1, L2, L3, L4, L5, A), f))

The above function f works for the first year but the following year it should be something like:

  solve(A) %*% (expm(A) %*% (A %*% dplyr::lag(result) + c(L1, L2, 
L3, L4, L5)))

OR:

  solve(A) %*% (expm(A) %*% (A %*% result[i - 1] + c(L1, L2, L3, L4, 
L5)))

But result itself cannot be referred this way inside pmap.

EDIT: With helper variables and the conditional case_when in the function, I can refer to the previous value by group's id_nr, but this solution is clumsy. Any better ideas?

f1 <- function(id_nr, L1, L2, L3, L4, L5, A) {
  case_when(id_nr == 1 ~ solve(A) %*% (expm(A) %*% (A %*% initialValues 
+ c(L1, L2, L3, L4, L5))),
        TRUE ~ NA_real_ )
}

f2 <- function(id_nr, L1, L2, L3, L4, L5, A, onebefore) {
  case_when(id_nr == 2 ~ solve(A) %*% (expm(A) %*% (A %*% onebefore + 
c(L1, L2, L3, L4, L5))),
        TRUE ~ NA_real_ )
}

f3 <- function(id_nr, L1, L2, L3, L4, L5, A, onebefore) {
 case_when(id_nr == 3 ~ solve(A) %*% (expm(A) %*% (A %*% onebefore + 
c(L1, L2, L3, L4, L5))),
        TRUE ~ NA_real_ )
}

final <- df %>%
  group_by(id) %>%
  mutate(id_nr = 1:n(),
         result = pmap(list(id_nr, L1, L2, L3, L4, L5, A), f1),
         result2 = pmap(list(id_nr, L1, L2, L3, L4, L5, A, result[1]), f2),
         result3 = pmap(list(id_nr, L1, L2, L3, L4, L5, A, result2[2]), f3)
  ) %>%
   select(year, id, id_nr, result, result2, result3) %>%
   as.data.frame()

Gives:

# year id id_nr                                               result
# 1 2000  a     1  69.99273, 187.46908, 133.68695, 39.14645, 192.07844
# 2 2001  b     1 150.08891, 105.06450, 134.75766, 143.28060, 86.68116
# 3 2002  a     2                                   NA, NA, NA, NA, NA
# 4 2000  b     2                                   NA, NA, NA, NA, NA
# 5 2001  a     3                                   NA, NA, NA, NA, NA
# 6 2002  b     3                                   NA, NA, NA, NA, NA
# result2                                          result3
# 1                               NA, NA, NA, NA, NA                               
#NA, NA, NA, NA, NA
# 2                               NA, NA, NA, NA, NA                               
#NA, NA, NA, NA, NA
# 3 1630.093, 2488.520, 2012.516, 1407.798, 1377.609                               
#NA, NA, NA, NA, NA
# 4 1751.489, 1444.543, 1531.545, 1922.810, 1544.579                             
#NA, NA, NA, NA, NA
# 5                               NA, NA, NA, NA, NA 30153.83, 
#36416.09, 19069.84, 18595.81, 31028.20
# 6                               NA, NA, NA, NA, NA 22072.69, 
#22904.23, 20731.95, 14812.70, 18054.79

(I still need to combine columns result, result2, result3.)

This what you're looking for? https://stackoverflow.com/questions/48868104/recursive-function-using-dplyr?utm_medium=organic&utm_source=google_rich_qa&utm_campaign=google_rich_qa — tjebo, Apr 28 '18 at 16:11
Thanks for the pointer. That really is a very similar problem, but purrr::accumulate() and base::Reduce() accept only 1 or 2 arguments. I need to apply purrr::pmap because it takes my list of lists. — hermo, Apr 28 '18 at 17:39

Recursive loop with dplyr::group_by

0 Answers0