1

I'm trying to automate my code, which requires solving ODE, over the changes of different parameters.

I have a nested data frame, with 2 columns:

  • PP, which is my "main" variable
  • data, which gathers 14 variables computed from the PP value.

This test_df_nesteddataframe looks like:

PP  | data 
0.0 | 14 variables
0.1 | 14 variables
0.2 | 14 variables

When unnesting the data column, I get:

PP  | u_croiss | kUpeak | kUstable | w_0 |... 
0.0 | 30000    | 560000 | 480000   | 300 |...
0.1 | 42000    | 588000 | 504000   | 420 |...
0.2 | 54000    | 616000 | 528000   | 540 |...

I then want to apply an ODE function to give me the evolution of 2 variables (U and V) over time on every row of my nested data frame. So, here, I'm trying to use the odefunction from the deSolvepackage, and make it run over several sets of parameters using mapfunction from the purrrpackage.

As described in the help of the ode function, I defined my yini with 2 initial values, defined the times and the list of parameters pars.

I then apply the ode function using these variables and stock the result in a out object.

Every step here is embedded in a custom function named make_ODE, that returns out.

To run this custom function over all sets of parameters that I have in my nested df, I use the map function as such:

test <- test_df_nested %>% 
  mutate(outputs = map2(PP,data, ~make_ODE(.x, .y)))

The make_ODE function looks like:


make_ODE <- function(PP,data){

  Unnested_df <- test_df_nested %>% 
    unnest(data)
  
  yini  <- c(V = 1)
  # Set the initial value of V = 1
  
  times <- seq(0, 200, by = 1)
  
  
  # PARS should be one value at a time
  parms  <- c(
    # v_croiss = Unnested_df$u_croiss,
    # v_croiss = c(3000,4000,5000) #Where the problem occurs
    k_V = 100)
  # k_U = kVlow,
  # u_croiss = 2)
  
  actual_ode <- function(yini, times, parms){
    out   <- ode(y = yini,
                 times,
                 equa_diff_sp_test_nest,
                 parms,
                 method = "rk4") %>% 
      as_tibble()
    
    return(out)
    
  }
  
  res <-actual_ode(yini, times, parms)
  return(res)
  
}


Where equa_diff_sp_test_nest is a function defined with:

equa_diff_sp_test_nest <- function(t,y,parms){
  
  V  <- y[1]

  
  with(as.list(c(y, parms)), {
    

    dVdt <- v_croiss * V * (1 - V/k_V)
    

        return(list(c(dVdt)))
  })
}

It works for a 1 row test_df_nested. But for more rows, the code generates an error, telling me that the output of the ode function has more elements than the initial vector y. Indeed, while y has 2 elements, the outputs have 2*nrows(test_df_nested).

I imagine the error comes from my custom function, but I can't understand where.

Edit: I think one of the problem might come from the fact that the parms vector does not change values according to the row of the nested df. For instance, in the example, there is 3 differents values of the u_croiss parameter, each should be used for solving the ODE, at each value of PP.

Another issue is that I have, on one hand, PPdependant parameters, that are stored in a nested df, upon which I apply the map function. On the other hand, I have time dependant parameters which should be computed through the ODE but using parameters from the nested df.

Actually, my question would be: how can I iterate a customed function with map, where parameters of the custom function vary with time.

Thanks in advance.

Edit:

Here's the output of the dput(test_df_nested):

structure(list(PP = c(0, 0.1, 0.2), data = list(structure(list(
    u_croiss = 30000, kUpeak = 560000, kUstable = 480000, w_0 = 300, 
    kWpeak = 9700, kWstable = 4500, k_m = 0.84, k_c = 4.74, chi_M = 9.31204630137287e-09, 
    chi_C = 2.91010985906386e-08, t_low = 105, t_kpeak = 155, 
    t_kstable = 255), row.names = c(NA, -1L), class = c("tbl_df", 
"tbl", "data.frame")), structure(list(u_croiss = 42000, kUpeak = 588000, 
    kUstable = 504000, w_0 = 420, kWpeak = 10185, kWstable = 4725, 
    k_m = 0.956, k_c = 5.409, chi_M = 9.24967155645443e-09, chi_C = 2.90483478901307e-08, 
    t_low = 105, t_kpeak = 152.5, t_kstable = 252.5), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame")), structure(list(
    u_croiss = 54000, kUpeak = 616000, kUstable = 528000, w_0 = 540, 
    kWpeak = 10670, kWstable = 4950, k_m = 1.072, k_c = 6.078, 
    chi_M = 9.19322958436019e-09, chi_C = 2.90004488568525e-08, 
    t_low = 105, t_kpeak = 150, t_kstable = 250), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame")))), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -3L), groups = structure(list(
    PP = c(0, 0.1, 0.2), .rows = structure(list(1L, 2L, 3L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -3L), .drop = TRUE))
Rachel
  • 13
  • 3
  • Hi Rachel! Welcome to StackOverflow! – Mark Jul 21 '23 at 08:22
  • 1
    In my experience questions with code AND data get solved a LOT faster than ones without, or ones with just one of the two. If you could include your data (`dput(test_df_nested)`), that'd be great! Thanks :-) – Mark Jul 21 '23 at 08:24
  • Hey @Mark! Thanks for the welcoming! I edited the post, hoping it's enough info. Thanks :) – Rachel Jul 21 '23 at 15:36
  • I'm having trouble with your code since I don't know what `kVlow` or `equa_diff_sp_test_nest` are. Can you make a minimal reproducible example (https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – pgcudahy Jul 24 '23 at 07:17
  • hi @pgcudahy, I edited my question following your request. `kVlow` is a parameter defined in another script, loaded using `source` function. `equa_diff_sp_test_nest` is the function used to determine the evolution the growth rate of my parameter `dVdt`, solved using ODE. Tell me if it's still unclear. Thanks a lot :) – Rachel Jul 24 '23 at 13:16
  • @Rachel, just a quick note: a [recent post](https://stackoverflow.com/a/76837926/3677576) shows a way how to solve your problem with a simpler and more streamlined pipeline that avoids nested lists. – tpetzoldt Aug 05 '23 at 08:50
  • @tpetzoldt Thanks a lot for letting me know, I'll look into that! (After running the code from your more recent post, I'm amazed by how simple the whole process is now). Thanks again for your interest for my question and the answer provided! – Rachel Aug 17 '23 at 23:47

1 Answers1

0

From my point of view, things like this can be made much easier, without the need of nested data frames, nested function definitions and map2. I would also recommend to simplify data and code further, to make it a minimal reproducible "toy"-example. Nevertheless, a main issue was the unnesting step in the make_ode function that re-uses the global test_df_nested data frame instead of the local data variable.

So instead of:

Unnested_df <- test_df_nested %>% 
  unnest(data)

consider to use:

Unnested_df <-  unnest(data, cols=c())

The resulting code can then look like:

library("dplyr")
library("tidyr")
library("purrr")
library("deSolve")

## data omitted for compactness
# test_df_nested <- structure(list(PP = c(0, 0.1, 0.2), data = .....

make_ODE <- function(PP, data){
  
  Unnested_df <-  unnest(data, cols=c())
  
  yini  <- c(V = 1)
  times <- seq(0, 200, by = 1)
  
  # PARS should be one value at a time
  parms  <- c(
    v_croiss = Unnested_df$u_croiss,
    k_V = 100)

  actual_ode <- function(yini, times, parms){
    out   <- ode(y = yini,
                 times,
                 equa_diff_sp_test_nest,
                 parms,
                 method = "rk4") %>% 
      as_tibble()
    return(out)
  }
  res <- actual_ode(yini, times, parms)
  return(res)
  
}

equa_diff_sp_test_nest <- function(t, y, parms){
  V  <- y[1]
  with(as.list(c(y, parms)), {
    dVdt <- v_croiss * V * (1 - V/k_V)
    return(list(c(dVdt)))
  })
}

test <- test_df_nested %>% 
  mutate(outputs = map2(PP, data, ~make_ODE(.x, .y)))

Final Remarks

  • The output data don't make sense yet, but I assume this should work with the full model and sensful parameters.
  • Instead of using nested lists and map2 the same can be done much easier with flat (unnested) dataframes and an apply-function.
  • Instead of rk4 I would recommend to use an odepack solver.

Package deSolve contains two families of solvers, (1) "industry-strength" solvers from the Fortran ODEPACK library, and (2) solvers from the classical Runge-Kutta family, implemented in C.

Except for special cases, ODEPACK solvers (e.g. lsoda or vode) should be preferred. They are faster and numerically more stable. All ODEPACK solvers use an automatic integration step size while the RK family contains algorithms with automatic (e.g. ode45) or fixed step size (e.g. rk4). The classical rk4 is easy to implement and sometimes useful for teaching and debugging, but can be numerically unstable and has "practically unknown" precision.

tpetzoldt
  • 5,338
  • 2
  • 12
  • 29
  • Thanks a lot for your help! It's now working the way I wanted. I should take a look into the `apply` function family, but I thought the `map` and nested dataframe were preferable in this case. Also, I used the `rk4` method because once again, I learned to do it this way. But I don't understand your remark, since the `rk4` is one of the method provided by an odepack solver (in this case, `deSolve`)? – Rachel Jul 27 '23 at 20:38
  • Done! Thanks a lot! – Rachel Jul 31 '23 at 15:18