My use case is to evaluate the Poisson pmf at all points less than, say, 10, and I will call such a function many times with different lambdas. The lambdas are not known ahead of time, so I cannot vectorize over them.
I heard somewhere about a secret trick, which is to use _pmf. What is the downside of doing so? Even so, it is still a bit slow. Is there any way to improve it without rewriting the pmf in C from scratch?
import numpy as np
import scipy.stats
%timeit scipy.stats.poisson.pmf(np.arange(0,10),3.3)
%timeit scipy.stats.poisson._pmf(np.arange(0,10),3.3)
a = np.arange(0,10)
%timeit scipy.stats.poisson._pmf(a,3.3)
10000 loops, best of 3: 94.5 µs per loop
100000 loops, best of 3: 15.2 µs per loop
100000 loops, best of 3: 13.7 µs per loop
Update
OK, I was simply too lazy to write it in Cython. I had expected there to be a faster solution for all discrete distributions that can be evaluated sequentially (iteratively) for consecutive x. E.g. P(X=3) = P(X=2) * lambda / 3 if X ~ Pois(lambda).
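To make the recurrence concrete, here is a minimal pure-Python sketch of what I have in mind. The function name poisson_pmf_upto and its parameter n are hypothetical, not part of SciPy. It starts from P(X=0) = exp(-lambda) and applies P(X=k) = P(X=k-1) * lambda / k. I have not timed it, and unlike the public pmf it does no argument checking.

```python
import math
import numpy as np

def poisson_pmf_upto(lam, n=10):
    """Return P(X=0), ..., P(X=n-1) for X ~ Pois(lam).

    Hypothetical helper (not a SciPy function); assumes lam > 0 and n >= 1.
    """
    out = np.empty(n)
    out[0] = math.exp(-lam)            # P(X=0) = exp(-lambda)
    for k in range(1, n):
        out[k] = out[k - 1] * lam / k  # P(X=k) = P(X=k-1) * lambda / k
    return out

probs = poisson_pmf_upto(3.3)          # pmf at x = 0..9 for lambda = 3.3
```

Each value costs one multiply and one divide after the first, so the whole range needs only a single call to exp.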
I have less faith in SciPy and Python now. The library functions aren't as advanced as I had expected.