I know that are plenty of questions about the log-normal in scipy as this, this, this, and this But I still have doubts.
I'm trying to reproduce this example with SciPy, because I can understand the steps, but I'm not able to.
the data is:
from scipy.stats import lognorm, norm
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
x = [20, 22, 25, 30, 60]
fig,ax = plt.subplots(1,1)
sns.kdeplot(x, color='blue',fill=False,ax=ax)
And I want to fit a log-normal:
shape_x, loc_x, scale_x = lognorm.fit(x,floc=0)
print(f'Estimated parameters for log-normal distribution of parameter x:')
print(f'Shape (s) of x: {shape_x}')
print(f'Location (loc) of x: {loc_x}')
print(f'Scale (scale) of x: {scale_x}')
accordingly to other questions in StackOverflow and scipy documentation, the mean and standard deviation should be:
mu_x = np.log(scale_x)
sigma_x = shape_x
print(f'Mean (μ) of x: {mu_x}')
print(f'Standard deviation (σ) of x: {sigma_x}')
Next, I try to create synthetic data with those parameters, to check:
synthetic_data_B = np.random.lognormal(mean=mu_x, sigma=sigma_x, size=len(x))
pdf_x = lognorm.pdf(x, s = shape_x, loc=loc_x, scale=scale_x)
fig,ax = plt.subplots(1,1)
sns.kdeplot(x, color='blue',fill=False,ax=ax)
sns.kdeplot(synthetic_data_B, color='red',fill=False,ax=ax)
ax.plot(x,pdf_x,color='green')
What I realize it:
- The median in the article is the scale parameter from scipy.
- The μ in the article is my mu_x = np.log(scale_x), but the σ is different, in the article is 0.437 and with scipy gives 0.391.
- If I evaluate the mean with lognorm.mean(shape_x,loc_x,scale_x) it gives quite similar to the article.
- If I evaluate the standard deviation with lognorm.std(shape_x,loc_x,scale_x), it gives different.
My questions are:
Why σ is different?
The synthetic data predicted with the fitted parameters do not match the original data, why?
If I try to do the opposite and to recover the x distribution from the fitted parameters, what I got is fairway from what should be.
How can I generate synthetic data to represent the real x?