
I am trying to calculate KL divergence using the entropy function of scipy.

My p is:

array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

and q is:

array([[ 0.05242718,  0.04436347,  0.04130855,  0.04878344,  0.04310538,
         0.02856853,  0.03303122,  0.02517992,  0.08525434,  0.03450324,
         0.14580068,  0.1286993 ,  0.28897473],
       [ 0.65421444,  0.11592199,  0.0642645 ,  0.02989768,  0.01385762,
         0.01756484,  0.01024294,  0.00891479,  0.01140301,  0.00718939,
         0.00938009,  0.01070139,  0.04644726],
       [ 0.65984136,  0.13251236,  0.06345234,  0.02891162,  0.02429709,
         0.02025307,  0.01073064,  0.01170066,  0.00678652,  0.00703361,
         0.00560414,  0.00651137,  0.02236522],
       [ 0.32315928,  0.23900077,  0.05460232,  0.03953635,  0.02901102,
         0.01294443,  0.02372061,  0.02092882,  0.01188251,  0.01377188,
         0.02976672,  0.05854314,  0.14313218],
       [ 0.7717858 ,  0.09692616,  0.03415596,  0.01713088,  0.01108141,
         0.0128005 ,  0.00847301,  0.01049734,  0.0052889 ,  0.00514799,
         0.00442508,  0.00485477,  0.01743218]], dtype=float32)

When I do:

entropy(p[0],q[0])

I am getting the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-201-563ea7d4decf> in <module>()
      4 print('p0:',p[0])
      5 print('q0:',q[0])
----> 6 entropy(p[0],q[0])

/Users/freelancer/anaconda/envs/py35/lib/python3.5/site-packages/matplotlib/mlab.py in entropy(y, bins)
   1570     y = np.zeros((len(x)+2,), x.dtype)
   1571     y[1:-1] = x
-> 1572     dif = np.diff(y)
   1573     up = (dif == 1).nonzero()[0]
   1574     dn = (dif == -1).nonzero()[0]

/Users/freelancer/anaconda/envs/py35/lib/python3.5/site-packages/numpy/lib/function_base.py in histogram(a, bins, range, normed, weights, density)
    781         if (np.diff(bins) < 0).any():
    782             raise ValueError(
--> 783                 'bins must increase monotonically.')
    784 
    785         # Initialize empty histogram

ValueError: bins must increase monotonically.

Why is this happening?

VeilEclipse

1 Answer


This works with the example arrays:

import scipy.stats
scipy.stats.entropy(p[0], q[0])
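
For context, here is a minimal self-contained sketch of what that call computes, using the first rows of the example arrays (the names p0 and q0 are just for illustration):

import numpy as np
from scipy.stats import entropy

# First rows of the p and q arrays from the question
p0 = np.array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.])
q0 = np.array([0.05242718, 0.04436347, 0.04130855, 0.04878344, 0.04310538,
               0.02856853, 0.03303122, 0.02517992, 0.08525434, 0.03450324,
               0.14580068, 0.1286993 , 0.28897473])

# scipy.stats.entropy(pk, qk) normalizes both inputs to sum to 1 and returns
# the KL divergence sum(pk * log(pk / qk)), using the natural log by default.
print(entropy(p0, q0))
# p0 puts all of its mass on the last bin, so this equals -log(q0[-1])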

Looking at the stack trace in the error message, it becomes apparent that you did not call scipy's entropy function but matplotlib's, which works differently. Here is the relevant part:

/Users/freelancer/anaconda/envs/py35/lib/python3.5/site-packages/matplotlib/mlab.py in entropy(y, bins)
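
For illustration, here is a hedged reconstruction of how the name can end up shadowed; the actual imports in the notebook are not shown in the question:

from scipy.stats import entropy  # scipy's entropy(pk, qk=None, base=None)
from matplotlib.mlab import *    # in older matplotlib this rebinds `entropy`
                                 # to mlab.entropy(y, bins)

# The name `entropy` now refers to matplotlib.mlab.entropy, which histograms
# its first argument with the given bins. q[0] is therefore treated as bin
# edges, and np.histogram raises "bins must increase monotonically."
entropy(p[0], q[0])  # ValueError, with p and q as defined in the question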

MB-F
  • Thanks for that. I imported both matplotlib and scipy's entropy explicitly. Maybe the two got conflicted – VeilEclipse May 05 '17 at 06:13
    @VeilEclipse Yes, that can happen. Results may depend on the order of imports and things like that. For this reason it's usually a [bad idea to `import *`](http://stackoverflow.com/q/2386714/3005167) into the global namespace - even if it seems convenient at first. – MB-F May 05 '17 at 06:35
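
A quick way to check which function a bare name is currently bound to (a small sketch, not part of the original thread):

print(entropy.__module__)
# 'matplotlib.mlab' means the name was shadowed; a module under scipy.stats
# (the exact submodule varies by SciPy version) means the intended function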