-1

I have experimental data points which I want to fit with my model which has 3 parameters (p, q, and R). Therefore I have three nested for loops to go through all possible combinations of the three parameters and to determine then the best fit (least squares of residuals). This is how I do that (with weighting = 1/exp(y) ) :

tgreparpq <- function(x, p, q, R){exp(-(p + q)*x)*(1 + x*(q + R*p) + x^2*((R*q^2)/2 + R*q*p) + x^3*R^2*q^2*p/2)}

pvalues <- seq(1e-10,1.01,0.01)
qvalues <- seq(1e-10,1.01,0.01)
repair <- seq(1e-10,1.01,0.01)

bestl <- array(dim = c(length(pvalues), length(qvalues)))
sigmapq <- array(dim = c(length(pvalues), length(qvalues)))
bestp <- vector(length = klength)
bestq <- vector(length = klength)
bestr <- array(dim = c(length(pvalues), length(qvalues)))
bestrk <- vector(length = klength)

#number of dose rates = klength
klength = 2

for (k in 1:klength){
  x <- c(0, dat$Dose[dat$Rate%in%rle(dat$Rate)$values[k]])
  SR <- vector(length = length(repair))
  for (i in 1:length(pvalues)){
    for (j in 1:length(qvalues)){
      for (l in 1:length(repair)){

    y <- tgreparpq(x,pvalues[i], qvalues[j], repair[l])
    SR[l] <- sum(((y-c(1,dat$Survival[dat$Rate%in%rle(dat$Rate)$values[k]]))*1/exp(y))^2)

      }
     bestl[i,j] <- which(SR %in% min(SR))
     sigmapq[i,j] <- SR[bestl[i,j]]
     bestr[i,j] <- repair[bestl[i,j]]
    }
  }
  besti <- which(sigmapq == min(sigmapq), arr.ind = TRUE)[1]
  bestp[k] <- pvalues[besti]
  bestj <- which(sigmapq == min(sigmapq), arr.ind = TRUE)[2]
  bestq[k] <- qvalues[bestj]
  bestrk[k] <- bestr[besti,bestj]
}

This is a rather slow process and I know you're not supposed to use that many for loops in r. Hence my question: Is there a better way to determine the fit paramters (i.e. is there a way to replace the for loops)?

Edit: Here is some example data:

dat =

Dose        Survival     Rate

1.9163E+00  6.42870E-01  3.0000E+01    
3.9713E+00  3.68150E-01  3.0000E+01    
5.9857E+00  1.76050E-01  3.0000E+01    
7.9572E+00  8.27670E-02  3.0000E+01    
1.0013E+01  2.01370E-02  3.0000E+01    
1.2015E+01  1.09200E-02  3.0000E+01
1.9683E+00  6.42530E-01  7.6800E+01    
2.9740E+00  4.86220E-01  7.6800E+01    
4.0354E+00  3.09730E-01  7.6800E+01    
5.0276E+00  2.13930E-01  7.6800E+01    
6.0851E+00  1.67200E-01  7.6800E+01    
7.0223E+00  1.04640E-01  7.6800E+01    
8.0531E+00  6.79020E-02  7.6800E+01    
9.0841E+00  3.14080E-02  7.6800E+01    
1.0135E+01  2.12510E-02  7.6800E+01    
1.1176E+01  8.39810E-03  7.6800E+01    
1.2168E+01  4.92070E-03  7.6800E+01    
1.4169E+01  1.53690E-03  7.6800E+01
Fabi
  • 71
  • 1
  • 8
  • 2
    Please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). No point showing your whole code when there's no `dat` provided to test. – Adam Quek Apr 25 '17 at 08:42
  • What is the function `tgreparpq`? – Adam Quek Apr 25 '17 at 08:54
  • I added it. It's a function which describes survival as a function of dose (with the three parameters). – Fabi Apr 25 '17 at 08:58
  • 2
    So you are actually doing a non-linear fit with you code and scanning all possible values of the parameters to minimize the MSE? Did I get it correctly? You could check this https://stat.ethz.ch/R-manual/R-devel/library/stats/html/nls.html. It may help you maybe... – Umberto Apr 25 '17 at 09:08
  • And something else. As already mentioned if you could provide a data frame with test data and what result you are expecting we could help you in doing it differently. – Umberto Apr 25 '17 at 09:10
  • 2
    What is `klength`? – Aurèle Apr 25 '17 at 09:44
  • klength is the number of different dose rates (I have a fourth loop where I go over the dose rates). With the example data I provided klength would equal 2. I added it above. – Fabi Apr 25 '17 at 09:47
  • https://cran.r-project.org/view=Optimization – jogo Apr 25 '17 at 09:58

2 Answers2

1

Not an answer but I took the liberty to prepare a data frame for others to use with your data. It may speed up answers.

df = data.frame(Dose = c(   
  1.9163E+00, 
  3.9713E+00, 
  5.9857E+00, 
  7.9572E+00, 
  1.0013E+01, 
  1.2015E+01, 
  1.9683E+00, 
  2.9740E+00, 
  4.0354E+00, 
  5.0276E+00, 
  6.0851E+00, 
  7.0223E+00, 
  8.0531E+00, 
  9.0841E+00, 
  1.0135E+01, 
  1.1176E+01, 
  1.2168E+01, 
  1.4169E+01),
  Survival = c(
  6.42870E-01,
  3.68150E-01,
  1.76050E-01,
  8.27670E-02,
  2.01370E-02,
  1.09200E-02,
  6.42530E-01,
  4.86220E-01,
  3.09730E-01,
  2.13930E-01,
  1.67200E-01,
  1.04640E-01,
  6.79020E-02,
  3.14080E-02,
  2.12510E-02,
  8.39810E-03,
  4.92070E-03,
  1.53690E-03),
  Rate = c( 
  3.0000E+01,
  3.0000E+01,
  3.0000E+01,
  3.0000E+01,
  3.0000E+01,
  3.0000E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01,
  7.6800E+01))

Hope it may help someone. I am trying with

nls(Survival ~ tgreparpq(Dose, p, q, R), data = df, start = list(p = 0.1, q = 0.1, R = 0.1))

but I am getting a "singular Gradient". I am checking but it maybe that your model has not a unique solution?

Umberto
  • 1,387
  • 1
  • 13
  • 29
  • Thanks. The problem with nls is that I get very different results depending on the starting values I give. This is the reason why I wanted to do it by "hand". – Fabi Apr 25 '17 at 09:34
  • That is the problem of non linear fits when the fitting function is complicated or with too many parameters. Believe me I am working on a project in which we are trying to fit a function with more than 10 parameters and is a pain. If the function has many minima the result you'll get will depend strongly from the initial parameters. You will find the minimum closer to your initial parameters. So is a good idea to know what a good start is. – Umberto Apr 25 '17 at 10:20
0

You can use expand.grid to get a data frame with one row for every possible combination of the variables you want to loop over, and then use apply which is typically much faster.

This code is not a full solution to your above code, but it's a minimal example of the idea...

dat <- read.table(header = T, text = "
Dose        Survival     Rate
1.9163E+00  6.42870E-01  3.0000E+01    
3.9713E+00  3.68150E-01  3.0000E+01    
5.9857E+00  1.76050E-01  3.0000E+01    
7.9572E+00  8.27670E-02  3.0000E+01    
1.0013E+01  2.01370E-02  3.0000E+01    
1.2015E+01  1.09200E-02  3.0000E+01
1.9683E+00  6.42530E-01  7.6800E+01    
2.9740E+00  4.86220E-01  7.6800E+01    
4.0354E+00  3.09730E-01  7.6800E+01    
5.0276E+00  2.13930E-01  7.6800E+01    
6.0851E+00  1.67200E-01  7.6800E+01    
7.0223E+00  1.04640E-01  7.6800E+01    
8.0531E+00  6.79020E-02  7.6800E+01    
9.0841E+00  3.14080E-02  7.6800E+01    
1.0135E+01  2.12510E-02  7.6800E+01    
1.1176E+01  8.39810E-03  7.6800E+01    
1.2168E+01  4.92070E-03  7.6800E+01    
1.4169E+01  1.53690E-03  7.6800E+01
")

tgreparpq <- function(x, p, q, R){exp(-(p + q)*x)*(1 + x*(q + R*p) + x^2*((R*q^2)/2 + R*q*p) + x^3*R^2*q^2*p/2)}

pvalues <- seq(1e-10,1.01,0.01)
qvalues <- seq(1e-10,1.01,0.01)
repair <- seq(1e-10,1.01,0.01)

x <- 1

results <- expand.grid(pvalues, qvalues, repair)
results$y <- apply(results, 1, function(result_n) {
  tgreparpq(x, result_n[1], result_n[2], result_n[3])
})
CJ Yetman
  • 8,373
  • 2
  • 24
  • 56