Find initial conditions for nonlinear models using the nlsLM function

Question

I am using the nlsLM function from the minpack.lm package to find the values of parameters a, e, and c that give the best fit to the data out. Here is my code:

n <- seq(0, 70000, by = 1)
TR <- 0.946
b <- 2000
k <- 50000
nr <- 25
na <- 4000
nd <- 3200
d <- 0.05775

y <- d + ((TR*b)/k)*(nr/(na + nd + nr))*n
## summary(y)
out <- data.frame(n = n, y = y)
plot(out$n, out$y)

## Estimate the parameters of a nonlinear model
library(minpack.lm)
k1 <- 50000
k2 <- 5000

fit_r <- nlsLM(y ~ a*(e*n + k1*k2 + c), data=out,
               start=list(a = 2e-10,
                          e = 6e+05, 
                          c = 1e+07), lower = c(0, 0, 0), algorithm="port")
print(fit_r)
## summary(fit_r)

df_fit <- data.frame(n = seq(0, 70000, by = 1))
df_fit$y <- predict(fit_r, newdata = df_fit)
plot(out$n, out$y, type = "l", col = "red", ylim = c(0,10))
lines(df_fit$n, df_fit$y, col="green")
legend(0,ceiling(max(out$y)),legend=c("observed","predicted"), col=c("red","green"), lty=c(1,1), ncol=1)

The fitting to the data seems to be very sensitive to initial conditions. For example:

with list(a = 2e-10, e = 6e+05, c = 1e+07), this gives a good fit:

Nonlinear regression model
  model: y ~ a * (e * n + k1 * k2 + c)
   data: out
        a         e         c 
2.221e-10 5.895e+05 9.996e+06 
 residual sum-of-squares: 3.225e-26

Algorithm "port", convergence message: Relative error between `par' and the solution is at most `ptol'.

with list(a = 2e-01, e = 100, c = 2), this gives a bad fit:

Nonlinear regression model
  model: y ~ a * (e * n + k1 * k2 + c)
   data: out
        a         e         c 
1.839e-08 1.000e+02 0.000e+00 
 residual sum-of-squares: 476410

Algorithm "port", convergence message: Relative error in the sum of squares is at most `ftol'.

So, Is there an efficient way to find initial conditions that give a good fit to the data ?

EDIT:

I added the following code to explain better the problem. The code is used to find the values of a, e, and c that give the best fit to data from several data sets. Each line in Y corresponds with one data set. By running the code, there is an error message for the 3rd data set (or 3rd line in Y): singular gradient matrix at initial parameter estimates. Here is the code:

TR <- 0.946
b <- 2000
k <- 50000
nr <- 25
na <- 4000
nd <- 3200
d <- 0.05775

Y <- data.frame(k1 = c(114000, 72000, 2000, 100000), k2 = c(47356, 30697, 214, 3568), n = c(114000, 72000, 2000, 100000), 
           na = c(3936, 9245, 6834, 2967), nd = c(191, 2409, 2668, 2776), nr = c(57, 36, 1, 50), a = NA, e = NA, c = NA)

## Create a function to round values
roundDown <- function(x) {
  k <- floor(log10(x))
  out <- floor(x*10^(-k))*10^k
  return(out)
}

ID_line_NA <- which(is.na(Y[,c("a")]), arr.ind=TRUE)
## print(ID_line_NA)

for(i in ID_line_NA){

  print(i)

  ## Define the variable y
  seq_n <- seq(0, Y[i, c("n")], by = 1)
  y <- d + (((TR*b)/(Y[i, c("k1")]))*(Y[i, c("nr")]/(Y[i, c("na")] + Y[i, c("nd")] + Y[i, c("nr")])))*seq_n
  ## summary(y)
  out <- data.frame(n = seq_n, y = y)
  ## plot(out$n, out$y)

  ## Build the linear model to find the values of parameters that give the best fit 
  mod <- lm(y ~ n, data = out)
  ## print(mod)

  ## Define initial conditions
  test_a <- roundDown(as.vector(coefficients(mod)[1])/(Y[i, c("k1")]*Y[i, c("k2")]))
  test_e <- as.vector(coefficients(mod)[2])/test_a
  test_c <- (as.vector(coefficients(mod)[1])/test_a) - (Y[i, c("k1")]*Y[i, c("k2")])

  ## Build the nonlinear model
  fit <- tryCatch( nlsLM(y ~ a*(e*n + Y[i, c("k1")]*Y[i, c("k2")] + c), data=out,
                                   start=list(a = test_a,
                                              e = test_e,
                                              c = test_c), lower = c(0, 0, 0)),
                             warning = function(w) return(1), error = function(e) return(2))
  ## print(fit)

  if(is(fit,"nls")){

    ## Plot
    tiff(paste("F:/Sources/Test_", i, ".tiff", sep=""), width = 10, height = 8, units = 'in', res = 300)
    par(mfrow=c(1,2),oma = c(0, 0, 2, 0))
    df_fit <- data.frame(n = seq_n)
    df_fit$y <- predict(fit, newdata = df_fit)
    plot(out$n, out$y, type = "l", col = "red", ylim = c(0, ceiling(max(out$y))))
    lines(df_fit$n, df_fit$y, col="green")
    dev.off()

    ## Add the parameters a, e and c in the data frame
    Y[i, c("a")] <- as.vector(coef(fit)[c("a")])
    Y[i, c("e")] <- as.vector(coef(fit)[c("e")])
    Y[i, c("c")] <- as.vector(coef(fit)[c("c")])

  } else{

    print("Error in the NLM")

  }
}

So, using the constraints a > 0, e > 0, and c > 0, is there an efficient way to find initial conditions for the nlsLM function that give a good fit to the data and to avoid error messages ?

I added some conditions to define initial conditions for the parameters a, e, and c:

Using the result of the linear model lm(y ~ n):

c = intercept/a - k1*k2 > 0 and 
e = slope/a > 0
0 < a < intercept/(k1*k2)

, where intercept and slope is the intercept and slope of lm(y ~ n), respectively.

G. Grothendieck · Accepted Answer · 2018-09-07T13:37:21.783

The problem is not how to find the initial values of the parameters. The problem is that this is a re-parameterized linear model with constraints. The parameters for the linear model are slope a*e and intercept a*(k1 * k2 + c) so there can only be 2 parameters such as slope and intercept but the formula in the quesiton attempts to define three: a, c and e.

We will need to fix one of the variables or, in general, add an additional constraint. Now if co is a vector whose first element is the Intercept and second element is the slope from the linear model

fm <- lm(y ~ n)
co <- coef(fm)

then we have the equations:

co[[1]] = a*e
co[[2]] = a*(k1*k2+c)

co, k1 and k2 are known and if we consider c as fixed then we can solve for a and e to give:

a = co[[2]] / (k1*k2 + c)
e = (k1 * k2 + c) * co[[1]] / co[[2]]

Since both co[[1]] and co[[2]] are positive and c must be too a and e are necessarily positive as well giving us a solution once we arbitrarily fix c. This gives an infinite number of a, e pairs which minimize the model, one for each non-negative value of c. Note that we do not need to invoke nlsLM for this.

For example, for c = 1e-10 we have:

fm <- lm(y ~ n)
co <- coef(fm)

c <- 1e-10
a <- co[[2]] / (k1*k2 + c)
e <- (k1 * k2 + c) * co[[1]] / co[[2]]
a; e
## [1] 5.23737e-13
## [1] 110265261628

Note that numerical problems may exist due to the large difference in magnitude among the coefficients; however, if we increase c that will increase e and decrease a making the scaling even worse so the parameterization of this problem given in the question seems to have inheritently bad numerical scaling.

Note that none of this requires running nlsLM to get the optimum coefficients; however, due to the bad scaling it might still be possible to improve the answer somewhat.

co <- coef(lm(y ~ n))
c <- 1e-10
a <- co[[2]] / (k1*k2 + c)
e <- (k1 * k2 + c) * co[[1]] / co[[2]]
nlsLM(y ~ a * (e * n + k1 * k2 + c), start = list(a = a, e = e), lower = c(0, 0))

which gives:

Nonlinear regression model
  model: y ~ a * (e * n + k1 * k2 + c)
   data: parent.frame()
        a         e 
2.310e-10 5.668e+05 
 residual sum-of-squares: 1.673e-26

Number of iterations to convergence: 12 
Achieved convergence tolerance: 1.49e-08

Thank you very much G. Grothendieck for your answer. It’s great! However, `c` should be > 0. — Pierre, Sep 07 '18 at 04:19
As mentioned c is arbitrary but the larger c is the worse the scaling will be so choose it to be some small number. I have done that above. — G. Grothendieck, Sep 07 '18 at 14:06
If the added constraint is `a < intercept/(k1*k2)` then since `a = co[[2]] / (k1*k2 + c)` it will be satisfied if `co[[2]] / (k1*k2 + c) < co[[1]] / (k1 * k2)`. Now suppose `c = 0` then it is satisfied if `co[[2]] < co[[1]]` and if `c` is greater than 0 then it is still satisfied because then the left hand side is even less. Thus the added constraint is automatically satisfied. — G. Grothendieck, Sep 10 '18 at 15:59
By adding `c <- 1e-10 a <- co[[2]] / (k1*k2 + c) e <- (k1 * k2 + c) * co[[1]] / co[[2]]` in the code given above, I have still cases where there are error messages `singular gradient matrix at initial parameter estimates`. — Pierre, Sep 10 '18 at 17:45
In that case just use the answer we computed algebraically. You don't actually need to run the nonlinear optimization. — G. Grothendieck, Sep 10 '18 at 19:14
In this case, `c = 1e-10` for all cases (i.e., all data sets) ? — Pierre, Sep 10 '18 at 19:51
As long as they are all similar to the one shown that should be ok. — G. Grothendieck, Sep 10 '18 at 20:13

Find initial conditions for nonlinear models using the nlsLM function

1 Answers1