I'm having trouble doing something as relatively simple as:
- Draw N samples from a Gaussian with some mean and variance
- Exponentiate those N samples (so that their logs are Gaussian)
- Fit a lognormal (using stats.lognorm.fit)
- Spit out a nice and smooth lognormal pdf without inf values (using stats.lognorm.pdf)
Here's a small working example of the output I'm getting:
from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import math
%matplotlib inline
def lognormDrive(mu,variance):
    size = 1000
    sigma = math.sqrt(variance)
    np.random.seed(1)
    gaussianData = stats.norm.rvs(loc=mu, scale=sigma, size=size)
    logData = np.exp(gaussianData)
    shape, loc, scale = stats.lognorm.fit(logData, floc=mu)
    return stats.lognorm.pdf(logData, shape, loc, scale)
plt.plot(lognormDrive(37,0.8))
And as you might notice, the plot makes absolutely no sense.
Any ideas?
I've followed these posts: POST1 POST2
Thanks in advance!
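For reference, here is a minimal sketch of the steps above under two assumptions that are not confirmed anywhere in this post: that the location parameter should be fixed at 0 rather than at `mu` (in SciPy's parameterization the Gaussian mean enters through `scale = exp(mu)`, and `shape` plays the role of sigma), and that the pdf should be evaluated on an increasing grid instead of on the unsorted samples. The function name `lognorm_pdf_on_grid` and the `n_grid` parameter are made up for illustration.

```python
import numpy as np
from scipy import stats


def lognorm_pdf_on_grid(mu, variance, size=1000, n_grid=200, seed=1):
    """Fit a lognormal to exp(Gaussian samples); return its pdf on a sorted grid."""
    sigma = np.sqrt(variance)
    rng = np.random.default_rng(seed)
    gaussian_data = rng.normal(loc=mu, scale=sigma, size=size)
    lognormal_data = np.exp(gaussian_data)

    # Assumption: fix loc at 0 (not mu), so that shape ~ sigma and scale ~ exp(mu).
    shape, loc, scale = stats.lognorm.fit(lognormal_data, floc=0)

    # Evaluate on an increasing grid, so the curve can be plotted against x
    # rather than against the sample index.
    x = np.linspace(lognormal_data.min(), lognormal_data.max(), n_grid)
    pdf = stats.lognorm.pdf(x, shape, loc=loc, scale=scale)
    return x, pdf, (shape, loc, scale)
```

With this, `x, pdf, params = lognorm_pdf_on_grid(1.0, 0.25)` followed by `plt.plot(x, pdf)` should give a smooth curve. Note that with `mu = 37` the samples are on the order of exp(37), roughly 1e16, so the x axis is enormous and the density values are correspondingly tiny; a smaller `mu` or a logarithmic x axis may be easier to read.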
Elaboration: I am building a small script that will
- Take raw data and fit a kernel density estimate (the empirical distribution)
- Assume different distributions given the mean and variance of the data, namely a Gaussian and a lognormal
- Plot those distributions together with the empirical distribution using interact
- Calculate the Kullback-Leibler divergence between the different distributions when one turns the knob for the mean and variance (and eventually skew)
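The planned script could be sketched roughly as follows, leaving out the `interact` plotting part. This is only an illustration under stated assumptions: the raw data is replaced by synthetic lognormal samples, the candidate distributions are obtained by moment matching (for the lognormal, sigma^2 = ln(1 + v/m^2) and mu = ln(m) - sigma^2/2, which requires a positive mean m), and the divergence is approximated with the trapezoidal rule on a grid. The helper names `moment_matched_candidates` and `kl_divergence` are hypothetical.

```python
import numpy as np
from scipy import stats


def moment_matched_candidates(data):
    """Return a Gaussian and a lognormal sharing the sample mean and variance."""
    m, v = np.mean(data), np.var(data)
    gaussian = stats.norm(loc=m, scale=np.sqrt(v))
    # Lognormal moment matching (requires m > 0):
    #   sigma^2 = ln(1 + v / m^2),  mu = ln(m) - sigma^2 / 2
    sigma2 = np.log1p(v / m**2)
    mu = np.log(m) - 0.5 * sigma2
    lognormal = stats.lognorm(s=np.sqrt(sigma2), loc=0, scale=np.exp(mu))
    return gaussian, lognormal


def kl_divergence(p, q, x, eps=1e-300):
    """Approximate KL(p || q) on the grid x with the trapezoidal rule."""
    # Clip to avoid log(0) and division by zero in the tails.
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    integrand = p * np.log(p / q)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)))


# Synthetic stand-in for the raw data (an assumption for this sketch).
rng = np.random.default_rng(0)
data = rng.lognormal(mean=1.0, sigma=0.5, size=1000)

kde = stats.gaussian_kde(data)  # kernel estimate of the empirical density
gaussian, lognormal = moment_matched_candidates(data)

x = np.linspace(1e-3, data.max() * 1.5, 2000)
p = kde(x)
kl_gauss = kl_divergence(p, gaussian.pdf(x), x)
kl_lognorm = kl_divergence(p, lognormal.pdf(x), x)
```

Because the integral is taken over a finite grid, the result depends on the grid range and resolution, so the values are best treated as a relative score between candidates rather than an exact divergence.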