I am trying to figure out why the following code returns different values for the sample's kurtosis:

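import pandas
import scipy.stats  # a bare "import scipy" does not reliably expose scipy.stats
e = pandas.DataFrame([1, 2, 3, 4, 5, 4, 3, 2, 1])
# Older pandas spelled this pandas.rolling_kurt(e, window=9); current versions use .rolling()
print("pandas.rolling_kurt:")
print(e.rolling(window=9).kurt())
print("\nscipy.stats.kurtosis:", scipy.stats.kurtosis(e))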

The output I am getting:

pandas.rolling_kurt:
          0
0       NaN
1       NaN
2       NaN
3       NaN
4       NaN
5       NaN
6       NaN
7       NaN
8 -1.060058

scipy.stats.kurtosis: [-1.15653061]

I have tried switching scipy's fisher argument between the Fisher and Pearson definitions, but to no avail.

user32430
    This also applies to `DataFrame.kurtosis()` and `Series.kurtosis()`. As @JohnE said, the difference is the bias correction: scipy.stats does not apply it by default (`bias=True`), while pandas always applies it and offers no option to turn it off. It is related to the familiar choice between n and (n-1) for the standard deviation. A fuller explanation is here: https://stats.stackexchange.com/questions/84050/definition-of-sample-excess-kurtosis – travc Dec 29 '17 at 22:29
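
To make the bias-correction point concrete, here is a minimal pure-Python sketch of the two estimators. The function name `excess_kurtosis` is made up for illustration, and its `bias` flag only mimics scipy's argument name; the correction shown is the standard sample-size adjustment for excess kurtosis, which I am assuming is what both libraries implement because it reproduces both numbers from the question.

```python
def excess_kurtosis(values, bias=True):
    """Excess (Fisher) kurtosis; hypothetical helper, not a library function."""
    n = len(values)
    mean = sum(values) / n
    # Second and fourth central moments, both divided by n (biased moments)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    g2 = m4 / m2 ** 2 - 3.0
    if bias:
        return g2
    # Sample-size correction (needs n > 3)
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)


data = [1, 2, 3, 4, 5, 4, 3, 2, 1]
print(round(excess_kurtosis(data, bias=True), 6))   # -1.156531, the scipy default in the question
print(round(excess_kurtosis(data, bias=False), 6))  # -1.060058, the pandas value in the question
```

With n = 9 the correction factor is far from negligible, which is why the two libraries disagree in the second decimal place on such a short sample.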

1 Answer

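Setting bias=False seems to do it. scipy's default is bias=True, which returns the uncorrected estimate, whereas pandas always applies the sample-size correction: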

In [3]: scipy.stats.kurtosis(e,bias=False)
Out[3]: array([-1.06005831])
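
For anyone on a current pandas, where `pandas.rolling_kurt` is no longer available, a rough side-by-side sketch follows. It assumes a recent pandas and scipy, and uses a Series rather than the question's one-column DataFrame so that each call returns a scalar.

```python
import pandas as pd
from scipy import stats

s = pd.Series([1, 2, 3, 4, 5, 4, 3, 2, 1])

# scipy: bias=True is the default, so no sample-size correction is applied
scipy_biased = stats.kurtosis(s.to_numpy())
# scipy with the correction switched on
scipy_unbiased = stats.kurtosis(s.to_numpy(), bias=False)
# pandas: whole-series and rolling kurtosis, both bias-corrected
pandas_full = s.kurt()
pandas_rolling = s.rolling(window=9).kurt().iloc[-1]

print("scipy, bias=True :", scipy_biased)
print("scipy, bias=False:", scipy_unbiased)
print("pandas Series.kurt:", pandas_full)
print("pandas rolling kurt (last window):", pandas_rolling)
```

Only the first value should differ; the other three should agree up to floating-point noise.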
JohnE