I'm struggling to speed up the optimization of a double for-loop function. I've already seen this post and others, but couldn't apply successfully the vectorization as proposed by Marek. Any suggestion / correction / advice to improve and hasten my code are welcome.
The aim of the function is to identify key physiological parameters (TT_TIn, b, g) linking meteorological data (TTnl:growing degree days, dDLv: daily variation of daylength) to crop yield (PS_TN).
I have to do this for many cultivars, each having a large dataset. Here is the code for data generation and the function definition.
# Data generation
lev_moy<-runif(300,80,180) # emergence date
rec<-runif(300,190,280) # harvest date
Site<-rep(c("Cot", "Gla"),150)
Year<-rep(c("2007", "2008", "2009"),100)
meteo<-data.frame("Year"=rep(c(rep(2007, 365), rep(2008, 365),rep(2009, 365)),2),
"Site"=c(rep("Cot",365*3), rep("Gla",365*3)),
"dDLv"=rep(c(seq(3,23,0.125),seq(23,-2.25,-0.125),-1.75),6),
"TTnl"=runif(6*365,11.5,14.5))
PS_TN_pl<-jitter(0.00087*900*((rec*13)-900), factor = 2) # yield
# Function
rmse <- function(x) {
TT_TIn<-x[1] # minimum growing degree day to finish the vegetative phase
b<-x[2] # growth parameter
g<-x[3] # critical daily variation in daylength
# (allowing the end of vegatative phase)
TT_TI<<-numeric(length(lev_moy))
PS_TN_est<<-numeric(length(lev_moy))
for (i in 1:length(Site)) {
# for each plant (row) I select the meteo data corresponding to
# the specific year and site
TTv<-c(subset(meteo, Year==Year[i]&Site==Site[i])$TTnl)
dDLv<-c(subset(meteo, Year==Year[i]&Site==Site[i])$dDLv)
for (j in lev_moy[i]:length(dDLv)) {
# for each plant (row) I sum the temperature (TTnl) corresponding to the
# vegetative phase. This period is detremined by a minimum growing degree
# day (TT_TIn) that I'll like to estimate with the optim function
if (sum(TTv[lev_moy[i]:j])>TT_TIn & dDLv[j]<g ) {
TT_TI[i]<-ifelse(j>rec[i],sum(TTv[lev_moy[i]:rec[i]]),
sum(TTv[lev_moy[i]:j]))
break } } }
PS_TN_est<-b*TT_TI*(rec-TT_TI)
error<-PS_TN_est-PS_TN_pl
rmse<-sqrt(mean(error^2))
return(rmse)
}
# Optimisation
optFl<-optim(c(915,0.00057,1,0.05),rmse)