I have a data set whose values range from 0 to 1e-5. I suspect the data can be described by a log-normal distribution, so I use scipy.stats.lognorm to fit it, and I want to plot the original data and the fitted distribution in the same figure using matplotlib.
First, I plot the sample as a histogram:
Then I add the fitted distribution as a line plot. However, this rescales the y-axis to very large values:
So the original data (the sample) can no longer be seen in the figure!
I've checked all the variables and found that pdf_fitted is very large (>1e7). I really don't understand why a simple fit with scistats.lognorm.fit to a sample that was generated from the same distribution with scistats.lognorm.pdf doesn't work. Here is the code that demonstrates my problem:
from matplotlib import pyplot as plt
from scipy import stats as scistats
import numpy as np
# generate a sample for x between 0 and 1e-5
x = np.linspace(0, 1e-5, num=1000)
y = scistats.lognorm.pdf(x, 3, loc=0, scale=np.exp(10))
h = plt.hist(y, bins=40) # plot the sample by histogram
# plt.show()
# fit the sample by using Log Normal distribution
param = scistats.lognorm.fit(y)
print("Log-normal distribution parameters : ", param)
pdf_fitted = scistats.lognorm.pdf(
    x, *param[:-2], loc=param[-2], scale=param[-1])
plt.plot(x, pdf_fitted, label="Fitted Lognormal distribution")
plt.ticklabel_format(style='sci', scilimits=(-3, 4), axis='x')
plt.legend()
plt.show()
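For comparison, here is a minimal sketch of a variant in which the histogram and the fitted curve should end up on the same scale. It rests on assumptions that are not in the code above: the sample is drawn with `scistats.lognorm.rvs` (random draws) instead of being taken from `lognorm.pdf` values, the parameters `shape_true` and `scale_true` are made up so that most draws fall below 1e-5, `loc` is held at 0 during the fit via `floc=0`, and the histogram is normalized to a density. A density over a range this narrow has to be large for its area to equal 1, so values of 1e5 or more are expected in this setup and do not by themselves indicate a failed fit.

```python
import numpy as np
from scipy import stats as scistats

rng = np.random.default_rng(0)

# Hypothetical "true" parameters, chosen only for illustration
shape_true, scale_true = 0.5, 2e-6

# Draw random variates (a sample), not pdf values
sample = scistats.lognorm.rvs(
    shape_true, loc=0, scale=scale_true, size=5000, random_state=rng)

# Fit with loc held fixed at 0 (an assumption about the data)
shape_fit, loc_fit, scale_fit = scistats.lognorm.fit(sample, floc=0)

# Evaluate the fitted pdf over the range covered by the sample
grid = np.linspace(sample.min(), sample.max(), 2000)
pdf_fitted = scistats.lognorm.pdf(
    grid, shape_fit, loc=loc_fit, scale=scale_fit)

# Density-normalized histogram: bar heights times bin widths sum to 1
density, edges = np.histogram(sample, bins=40, density=True)
area = np.sum(density * np.diff(edges))

print("fitted shape, loc, scale:", shape_fit, loc_fit, scale_fit)
print("histogram area:", area)
print("max of fitted pdf:", pdf_fitted.max())
```

With matplotlib, `plt.hist(sample, bins=40, density=True)` followed by `plt.plot(grid, pdf_fitted)` should then draw both on one y-axis scale, since both are densities.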