I am trying to fit my data to a negative binomial model. My data is simply an array of numbers
all_hits = [0, 4000, 200, ...]
If I plot the data as a histogram and overlay the negative binomial PMF with some parameters chosen by eye, I get the following:
import matplotlib.pyplot as plt
import scipy.stats as ss
import scipy.optimize as so
import numpy as np

k = np.arange(30000)
plt.plot(k, ss.nbinom.pmf(k, n=3, p=1.0/300, loc=0), 'g-')
bins = plt.hist(all_hits, 100, density=True, alpha=0.8)
However, when I try to use scipy.optimize.fmin to estimate n, p, and loc, I get a negative log-likelihood of infinity:
# function from http://stackoverflow.com/questions/23816522/fitting-negative-binomial-in-python
def likelihood_f(params, x, neg=-1):
    n, p, loc = params
    n = np.round(n)      # by definition, n should be an integer
    loc = np.round(loc)
    # neg=-1 flips the sign so that fmin minimizes the negative log-likelihood
    return neg * (np.log(ss.nbinom.pmf(x, n, p, loc))).sum()

xopt, fopt, iterations, funcalls, warn = so.fmin(
    likelihood_f, (3, 1.0/300, 0), args=(all_hits, -1),
    full_output=True, disp=False)
print('optimal solution: r=%f, p=%f, loc=%f' % tuple(xopt))
print('log likelihood = %f' % fopt)
print('ran %d iterations with %d function calls' % (iterations, funcalls))
print('warning code %d' % warn)
optimal solution: r=3.000000, p=0.003333, loc=0.000000
log likelihood = inf
ran 121 iterations with 604 function calls
warning code 1
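My guess (unconfirmed) is that the infinity comes from at least one data point being assigned zero probability under the initial guess, since a single pmf value of 0 makes the summed log-likelihood -inf. A quick check along these lines should confirm or rule that out:

# diagnostic sketch: count data points assigned zero probability at the
# initial guess; any such point makes the total log-likelihood -inf
pmf_vals = ss.nbinom.pmf(np.asarray(all_hits), 3, 1.0/300, 0)
print('points with zero probability: %d' % (pmf_vals == 0).sum())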
Is there another method for estimating the parameters of a negative binomial distribution? Is there something wrong with my likelihood function or my initial guess that would prevent the estimate from ever improving on the guess?
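One direction I am considering, in case it helps frame an answer: fix loc at 0, use method-of-moments estimates as the starting point, and optimize nbinom.logpmf instead of log of the pmf so that small probabilities are handled more stably. This is only a sketch under those assumptions, with names (neg_loglik, n0, p0) of my own choosing:

def neg_loglik(params, x):
    n, p = params
    # logpmf is more numerically stable than np.log(pmf) for tiny probabilities
    return -ss.nbinom.logpmf(x, n, p).sum()

hits = np.asarray(all_hits)
mean, var = hits.mean(), hits.var()
# method-of-moments: mean = n*(1-p)/p and var = n*(1-p)/p**2,
# so p = mean/var and n = mean*p/(1-p); assumes var > mean (overdispersion)
p0 = mean / var
n0 = mean * p0 / (1 - p0)

res = so.minimize(neg_loglik, x0=(n0, p0), args=(hits,),
                  bounds=[(1e-6, None), (1e-6, 1 - 1e-6)],
                  method='L-BFGS-B')
print('n=%f, p=%f' % tuple(res.x))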