I have fit a distribution to my data using scipy.stats.lognorm, and now I am trying to plot the distribution. I have generated the fit to my data with seaborn:
ax = sns.distplot(1 - clint_unique_cov_filter['Identity'], kde=False, hist=True,
                  norm_hist=True, fit=lognorm, bins=np.linspace(0, 1, 500))
ax.set_xlim(0, 0.1)
Which gets me the fit I expect:
I need to use the parameters of this distribution for further analysis, but first I wanted to verify that I understood the terms. This post shows that I should apply the following transformations to the output of lognorm.fit to get the standard mu and sigma parameters of a lognormal:
shape, loc, scale = lognorm.fit(1 - clint_unique_cov_filter['Identity'])
mu = np.log(scale)
sigma = shape
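Note that this mapping assumes loc is 0. By default lognorm.fit estimates loc freely, in which case shape and scale describe the shifted variable x - loc rather than x itself. A minimal sketch of the zero-loc case, using synthetic data (sample, true_mu, and true_sigma are made up for illustration, and the data is assumed to be strictly positive):

```python
import numpy as np
from scipy.stats import lognorm

# Synthetic lognormal sample with known parameters (illustrative values only)
rng = np.random.default_rng(0)
true_mu, true_sigma = -4.0, 0.5
sample = rng.lognormal(mean=true_mu, sigma=true_sigma, size=20_000)

# Pin loc at 0 so the fit is a plain two-parameter lognormal
shape, loc, scale = lognorm.fit(sample, floc=0)

# With loc == 0, the usual parameters follow directly
mu = np.log(scale)
sigma = shape
print(mu, sigma)
```

If loc is left free and comes back far from 0, mu and sigma computed this way would not describe the data directly.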
But when I try to plot this, I do not get the distribution I expect. To double-check, I tried plugging the fitted values straight back into a plot, but the distribution is noticeably different:
s, l, sc = lognorm.fit(1 - clint_unique_cov_filter['Identity'])
rv = lognorm(s, l, sc)
plt.plot(np.linspace(0, 0.1), rv.pdf(np.exp(np.linspace(0, 0.1))))
Why is this distribution not the same as the one seaborn produces?
EDIT:
Reading the seaborn code led me to my answer:
params = lognorm.fit(1 - clint_unique_cov_filter['Identity'])
xvals = np.linspace(0, 0.1)
pdf = lambda x: lognorm.pdf(x, *params)
yvals = pdf(xvals)
plt.plot(xvals, yvals)
This provides the correct plot:
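For reference, a sketch of why the earlier attempt differed: lognorm.pdf expects x in the original data units and takes the logarithm internally, so passing np.exp(x) evaluates the density at the wrong points. The parameter values below are hypothetical stand-ins for the fitted ones, not results from my data:

```python
import numpy as np
from scipy.stats import lognorm

# Hypothetical fitted parameters (shape, loc, scale), for illustration only
s, loc, scale = 0.5, 0.001, 0.02
xvals = np.linspace(0.002, 0.1, 50)  # stay above loc, where the pdf is defined

# Correct: evaluate the pdf directly at the x values
pdf_scipy = lognorm.pdf(xvals, s, loc, scale)

# The same density written out by hand, with mu = log(scale) and sigma = s,
# applied to the shifted variable z = x - loc
mu = np.log(scale)
z = xvals - loc
pdf_manual = np.exp(-(np.log(z) - mu) ** 2 / (2 * s ** 2)) / (z * s * np.sqrt(2 * np.pi))

# Incorrect: exponentiating x first evaluates the pdf at points >= 1
pdf_wrong = lognorm.pdf(np.exp(xvals), s, loc, scale)

print(np.allclose(pdf_scipy, pdf_manual))
print(np.allclose(pdf_scipy, pdf_wrong))
```

Plotting xvals against pdf_scipy with plt.plot should then give the same kind of curve that seaborn overlays on the histogram.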