1

The Weibull probability density function is given as such:

enter image description here

and the cumulative hazard function is represented as:

enter image description here

Now, I have a dataset where, the t and H are known as such:

## Import libraries
from scipy.optimize import curve_fit
from scipy import stats
from scipy.optimize import minimize
import numpy as np
import matplotlib.pyplot as plt

## Load the dataset
t = np.array([4,        6,         8,       10])        ## Time data         
H = np.array([0.266919, 0.518181,  0.671102, 1.808351]) ## Cumulative Hazard

I wish to find the shape parameter(ρ) and scale parameter(λ) for the above dataset.

For simplicity of calculation, I have used the below logarithmic formulation:

enter image description here

Hence the logarithm for the time vector and Cumulative Hazard can be written as such:

ln_t = np.log(t)   # log of time vector
ln_H = np.log(H)   # log of Cumulative Hazard

I have applied below two methods

Method-1: By curve fitting using a linear function as such:

## Define a linear function
def ln_cum_hazard(LN_T, rho, myu):
    LN_H = rho * LN_T - myu
    return LN_H

## Model parameters
popt, pcov = curve_fit(ln_cum_hazard, ln_t, ln_H)

## Shape parameter
ρ = popt[0]
print("\n ρ (shape parameter) = \n", ρ)

## Scale parameter
λ = np.exp(popt[1]/popt[0])
print("\n λ (scale parameter) = \n", λ)

## Predicted values of H
H_pred = (t/λ)**ρ

## Plot H
plt.plot(t,H,'r',label = 'Actual Data')
plt.plot(t,H_pred,'b',label = 'Curve fit: scipy.optimize')
plt.legend()
plt.title('ρ = {} and λ = {}'.format(ρ,λ))
plt.xlabel('t')
plt.ylabel('H')
plt.show()

By curve fitting via scipy.optimize, I get the below fitted curve:

enter image description here

Method-2: By maximum likelihood estimation via stats.norm.logpdf()

## Define the linear function
def ln_cumulative_haz(params):
    rho = params[0]               # ρ (rho):    Shape parameter
    myu = params[1]               # λ (lambda): Scale parameter
    sd  = params[2]
      
    ## Linear function
    LN_H = rho * ln_t - myu

    ## Define the negative log likelihood function
    LL = -np.sum( stats.norm.logpdf(ln_H, loc = LN_H, scale=sd ) )

    return(LL)

## Inital parameters
initParams = [1, 1, 1]

## Minimize the negative log likelihood function
results = minimize(ln_cumulative_haz, initParams, method='Nelder-Mead')
  
## Model parameters
estParms = results.x

## Shape parameter
ρ = estParms[0]
print("\n ρ (shape parameter) = \n", ρ)

## Scale parameter
λ = np.exp(estParms[1]/estParms[0])
print("\n λ (scale parameter) = \n", λ)

## Predicted values of H
H_pred = (t/λ)**ρ

## Plot H
plt.plot(t,H,'r',label = 'Actual Data')
plt.plot(t,H_pred,'b',label = 'Curve fit: stats.norm.logpdf()')
plt.legend()
plt.title('ρ = {} and λ = {}'.format(ρ,λ))
plt.xlabel('t')
plt.ylabel('H')
plt.show()

By MLE via stats.norm.logpdf(), I get the below curve fit:

enter image description here

By both the methods, I have almost the same accuracy for ρ(shape parameter) and λ(scale parameter).

Now I have below 3 doubts:

  1. Is the procedure to calculate the ρ and λ parameters correct for both the methods?

  2. To define the negative log-likelihood function I have used "stats.norm.logpdf". However, the underlying function of data is a two parameter weibull distribution. Will this be correct?

  3. Is there any other method to calculate the ρ and λ in Python?

Can somebody please help me out with these doubts?

Edit-1: Considering scipy.stats.weibull_min

I have applied below codes for minimization of negative log likelihood of weibull function.

## Define the linear function
def ln_cumulative_haz_weibull(params):
    rho    = params[0]               # ρ (rho):    Shape parameter
    myu    = params[1]               # λ (lambda): Scale parameter
    shape  = params[2]
    sd     = params[3]
    
   

    ## Linear function
    LN_H = rho * ln_t - myu

    ## Calculate negative log likelihood
    LL = -np.sum( stats.weibull_min.logpdf(x = ln_H,
                                           c = shape,
                                           loc = LN_H,
                                           scale = sd ) )

    return(LL)


## Inital parameters
initParams = [1, 1, 1, 1]   ## params[i]

## Results of MLE
results = minimize(ln_cumulative_haz_weibull, initParams, method='Nelder-Mead')


## Model parameters
estParms = results.x

ρ = estParms[0]
print("\n ρ (shape parameter) = \n", ρ)

λ = np.exp(estParms[1]/estParms[0])
print("\n λ (scale parameter) = \n", λ)


##ln_H_pred = estParms[0] * ln_t - estParms[1]
##print("\n ln_H_pred = \n",ln_H_pred)


H_pred = (t/λ)**ρ
print("\n H_pred = ",H_pred)
## Plot ln(H)
plt.plot(t,H,'r',label = 'Actual Data')
plt.plot(t,H_pred,'b',label = 'Curve fit: stats.weibull_min.logpdf')
plt.legend()
plt.title('ρ = {} and λ = {}'.format(ρ,λ))
plt.xlabel('t')
plt.ylabel('H')
plt.show()

However, the curve fitting results is not good by using stats.weibull_min.logpdf() and appear as such:

enter image description here

Also I get an error which states:

Warning (from warnings module): File "C:\Users\user\AppData\Local\Programs\Python\Python37\lib\site-packages\scipy\optimize\optimize.py", line 597 numpy.max(numpy.abs(fsim[0] - fsim[1:])) <= fatol): RuntimeWarning: invalid value encountered in subtract

Can somebody please help me out where am I going wrong?

NN_Developer
  • 417
  • 6

1 Answers1

0

I agree with your result :

ρ (shape parameter) = 1.914

λ (scale parameter) = 8.3568

The fitting is rather good :

enter image description here

I cannot say what you have drawn. Check it.

JJacquelin
  • 1,529
  • 1
  • 9
  • 11
  • Thanks a lot @JJacquelin for your feedback. By using stats.weibull_min.logpdf() as shown in Edit-1, I am getting ρ = 1.0 and λ = 2.718 which is hot so good. Could you please let me know which curve fit method did you use? Also, the curve limits are t=[4,10]. Could you please let me know how did you interpolate the curve in t=[2,12] – NN_Developer Mar 06 '23 at 05:18
  • 1
    Obviously the iterative process doesn't start correctly The final values remain at the initial values ρ = 1.0 and λ = e^1=2.718 . I cannot say more than the judicious Reinderien's comment. The curve fit method I use is classical non linear regression implemented in an homemade software. The continuous drawing of H(t) between any specified limits is also done with my own application (Built with the progiciel Delphi from Borland). – JJacquelin Mar 06 '23 at 09:03