I have a longitudinal dataset for plant growth that includes repeated measurements in different seasons. I am trying to estimate the parameters for a growth equation using this dataset. The growth equation is as follows:
Lt = 150*((1 + ((150/Lt_1)^(1/p)-1)exp(-k*td/365))^(-p))
Lt represents the plant height at a measurement, Lt_1 represents the plant height at the previous measurement, and td represents the duration of time between measurements. The parameters I want to estimate are k and p. However, I believe that the parameter k may vary depending on the season, so I want to have specific k values for each season. Additionally, I would like to include individual variation as a random effect in the model.
To estimate these parameters, I wrote a code using the optim function, where I assigned each parameter (p, k, I) and calculated the sum of the likelihood. However, the code takes a long time to run and fails to converge. In my actual data, I have a larger number of individuals (n = 100), making the computation even more challenging.
I am seeking advice on the best approach to estimate these parameters efficiently and effectively. Any suggestions or alternative methods would be greatly appreciated.
Data
# generate dummy data
n = 10
id = as.factor(1:n)
season = c("s1","s2","s3","s4")
L0 = rnorm(n,40,5)
td = rnorm(n, 180, 10)
growth <- function(td, Lt_1){
return(function(k,p){
150*((1 + ((150/Lt_1)^(1/p)-1)*exp(-k*td/365))^(-p))
})
}
L1 = growth(Lt_1 = L0, td = td)(p = 1.2, k = 0.55)
L2 = growth(Lt_1 = L1, td = td)(p = 1.2, k = 1.09)
L3 = growth(Lt_1 = L2, td = td)(p = 1.2, k = 0.62)
L4 = growth(Lt_1 = L3, td = td)(p = 1.2, k = 0.97)
df <- data.frame(id = id, L0 = L0, L1 = L1, L2 = L2, L3 = L3, L4 = L4) %>%
pivot_longer(cols = L0:L4, names_to = "Ltype", values_to = "Lobs") %>%
mutate(Lobs = round(Lobs)) %>%
group_by(id) %>%
mutate(Lt_1 = lag(Lobs)) %>%
filter(!is.na(Lt_1)) %>%
mutate(td = round(rnorm(4, 180, 10))) %>%
mutate(season = season) %>%
ungroup() %>%
select(id, season, td, Lt_1, Lobs)
head(df)
# A tibble: 6 x 5
id season td Lt_1 Lobs
<fct> <chr> <dbl> <dbl> <dbl>
1 1 s1 183 36 44
2 1 s2 190 44 65
3 1 s3 193 65 77
4 1 s4 182 77 96
5 2 s1 191 43 53
6 2 s2 177 53 75
Caluculate the sum of likelihood
# define the function to calculate the sum of negative log likelihood
len_id = length(unique(df$id))
sum_negloglike <- function(par){
# parameters to estimate
p = par[1]
sigma = par[2]
k <- par[3:6]
I = par[seq(7, 6+length(unique(df$id)), by = 1)]
# individuals and seasons to loop
indv <- unique(df$id)
season <- unique(df$season)
# define the dataframe
new_df <- data.frame()
# assign k and I
## for individual i
for(i in 1:length(indv)){
indv_i = indv[i]
I_i = I[which(indv == indv_i)]
df_i = df %>%
filter(id == indv_i) %>%
mutate(I = I_i)
# for season j in indvidual i
for (j in 1:length(season)){
season_j = season[j]
k_j = k[which(season == season_j)]
df_i_j = df_i %>%
filter(season == season_j) %>%
mutate(k = k_j) %>%
# estimate the growth
mutate(Lpred = growth(td, Lt_1)(k+I, p))
# add it to the dataframe
new_df <- bind_rows(new_df, df_i_j)
}
}
# calculate the sum of negative log-likelihood
Lobs <- new_df$Lobs
Lpred <- new_df$Lpred
negloglike_sum = sum_negloglike(Lobs)(Lpred, sigma)
return(negloglike_sum)
}
Maximum likelihood estimation
# set initial parameters
p = 1.2
sigma = 1.88
k <- c(0.55, 1.09, 0.62, 0.97)
I = rnorm(len_id, 0, 0.01)
par = c(p, sigma, k, I)
# calculate the sum of likelihood with initial parameter values
sum_negloglike(par)
# estimate the parameters
optim(par, sum_negloglike)