2

I would like to add an additional column to an existing dataframe by mutating the optimized result from the OPTIM() function. The code works when I strip the dataframe down to 1 row, but gives the following error when there are 2 rows:

Caused by error in optim(): ! objective function in optim evaluates to length 2 not 1


library(tidyverse)
library(dbplyr)


hfcn <- function(b, D, U, U2, U3){
            Din = ((1 - D)^-b - 1) / b
            rsd_2 = ((U / (1 + b*Din*2)^(1/b) - U2)^2)^0.5
            rsd_3 = ((U / (1 + b*Din*3)^(1/b) - U3)^2)^0.5
            rst_tot = rsd_2 + rsd_3
            return(rst_tot)
          }

#################################
####   If I create dataframe A with a single row, the code works, but fails when there are 2 rows ####

# A <- data.frame(ID = c("A1")
#                 , U = c(28844)
#                 , D = c(0.7941582)
#                 , U2 = c(3417)
#                 , U3 = c(2465)
#                 )
#################################

A <- data.frame(ID = c("A1", "A2")
                , U = c(72625, 28844)
                , D = c(0.7785440, 0.7941582)
                , U2 = c(7916, 3417)
                , U3 = c(5409, 2465)
                )



A2 <- mutate(A
             , C = optim(par = 1.1
                          , hfcn
                          , D = A$D
                          , U = A$U
                          , U2 = A$U2
                          , U3 = A$U3
                          , method = "BFGS"
                          #, method = "L-BFGS-B"#, lower = 0, upper = 3
                        )[1]
             )

Error in mutate(): ! Problem while computing C = ...[]. Caused by error in optim(): ! objective function in optim evaluates to length 2 not 1 Backtrace:

  1. dplyr::mutate(...)
  2. stats::optim(...)

I have successfully run the code when I limit the database to a single row. I can run both rows independently through the function and mutate and additional column using OPTIM(), so the code and function inputs check out. I suspect I may have to use a function from purrr but I cannot find anything online that helps me with this problem.

1 Answers1

2

Short answer

You need to pass these values row-wise to optim and not the entire vector:

A2 <- A %>% 
  rowwise() %>% 
  mutate(C = optim(par = 1.1
                   , hfcn
                   , D = D
                   , U = U
                   , U2 = U2
                   , U3 = U3
                   , method = "BFGS"
                   #, method = "L-BFGS-B"#, lower = 0, upper = 3
  )[1]
  ) %>% 
  ungroup()

Note, in the tidyverse when you reference a variable you do not need to use $, as references are understood to be from the data frame, unless specified otherwise.

Long answer

When you pass your function a vector of arguments it will return a value for each set of constants (so your function is vectorized):

hfcn(b = c(.5, .5), 
     U = c(72625, 28844), 
     D = c(0.7785440, 0.7941582), 
     U2 = c(7916, 3417), 
     U3 = c(5409, 2465))
[1] 2654.797 2043.005

However you are trying to find an optima. So you need to set those values to be constant one set at a time:

optim(par = 1.1, 
      hfcn, 
      D = 0.7785440,
      U = 72625,
      U2 = 7916, 
      U3 = 5409, 
      method = "BFGS"
)

You are passing the arguments as a vector, but optim is not vectorized. It will not find the optima for each set of constants (e.g., hfcn(U = 72625, D = 0.7785440, U2 = 7916, U3 = 5409) then hfcn(U = 28844, D = 0.7941582, U2 = 3417, U3 = 2465))-- this is what the error is saying:

optim(par = 1.1, 
      hfcn, 
      D = c(0.7785440, 0.7941582),
      U = c(72625, 28844),
      U2 = c(7916, 3417), 
      U3 = c(5409, 2465), 
      method = "BFGS"
)
Error in optim(par = 1.1, hfcn, D = c(0.778544, 0.7941582), U = c(72625,  : 
  objective function in optim evaluates to length 2 not 1

This is likely why it worked when you only did one row of data. If you had three rows you would get the same error but it would say evaluates to length 3 not 1.

Additional notes

The above code returns C as list-column, which is not usually what is expected. So you might try:

A2 <- A %>% 
  rowwise() %>% 
  mutate(C = unlist(optim(par = 1.1, 
                          hfcn,
                          D = D, 
                          U = U, 
                          U2 = U2,
                          U3 = U3, 
                          method = "BFGS")[1])
  ) %>% 
  ungroup()

  ID        U     D    U2    U3     C
  <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A1    72625 0.779  7916  5409 0.805
2 A2    28844 0.794  3417  2465 1.11 

Also depending on the size of your data set, rowwise is very inefficient so it could take some time to run. This is normally combatted by transforming your data into a long-format and not using rowwise.

LMc
  • 12,577
  • 3
  • 31
  • 43
  • 2
    LMc, Thank you for answer my query correctly and quickly! I have been struggling with implementing functions like uniroot(), optim(), and nls() for a while now; this particular problem has had me frustrated and stumped for 3 days or more. A simple 12 character code modification was all it took, thanks again! – Mel Sorrell Apr 05 '23 at 18:34
  • 1
    Thanks again LMc, I voted up and accepted the excellent explanation. Thanks! – Mel Sorrell Apr 06 '23 at 18:03