2

I am new to using Statistical functions in xls. I am able to the KURT function in xls to calculate the Kurtosis or Skewness, given a set of numbers.

But my requirement is to do it in the other way, like for a given Skewness or Kurtosis, is there a way to generate random numbers. Any pointers on how to do that.

The function should take the skewness or Kurtosis value as input, and it should generate 50 random numbers with 1 being minimum and 100,000 being maximum.

If Excel does not have a way, I am looking for suggestions in Python.

Can you please help me how to do this in Excel or Python?

ascripter
  • 5,665
  • 12
  • 45
  • 68
sk123
  • 61
  • 2
  • Have you checked this answer? http://stackoverflow.com/a/19605289/1391441 Also, be aware of the caveats mentioned here: http://stackoverflow.com/q/4807398/1391441 – Gabriel Apr 27 '16 at 19:05
  • Yeah,I checked on that.But I think it is trying to print something,I want the output(meaning numbers) to be printed on the console to start with.Also,can you help me understand where the input Kurtosis in that example? – sk123 Apr 27 '16 at 19:27
  • This is apparently not a trivial exercise and not being an expert I'll probably do more harm than good giving you advises. The only advise I can give you is to take a careful look at the comments in the second link I provided. – Gabriel Apr 27 '16 at 20:27

1 Answers1

0

After experimenting with several distributions, the generalised Gamma distribution seems to be flexible enough to adjust either the skew or the kurtosis to the desired value, but not both at the same time like what was asked in the question @gabriel mentioned in his comment.

So to draw a sample out of a g-Gamma distribution with a single fixed moment, you can use scipy.optimize to find a distribution with minimizes a penalty function (I chose (target - value) ** 2)

from scipy import stats, optimize
import numpy as np

def random_by_moment(moment, value, size):
    """ Draw `size` samples out of a generalised Gamma distribution
    where a given moment has a given value """
    assert moment in 'mvsk', "'{}' invalid moment. Use 'm' for mean,"\
            "'v' for variance, 's' for skew and 'k' for kurtosis".format(moment)
    def gengamma_error(a):
        m, v, s, k = (stats.gengamma.stats(a[0], a[1], moments="mvsk"))
        moments = {'m': m, 'v': v, 's': s, 'k': k}
        return (moments[moment] - value) ** 2    # has its minimum at the desired value      

    a, c = optimize.minimize(gengamma_error, (1, 1)).x    
    return stats.gengamma.rvs(a, c, size=size)

n = random_by_moment('k', 3, 100000)
# test if result is correct
print("mean={}, var={}, skew={}, kurt={}".format(np.mean(n), np.var(n), stats.skew(n), stats.kurtosis(n)))

Before that I came up with a function that matches skew and kurtosis. However even the g-Gamma is not flexible enough to serve this purpose depending on how extreme your conditions are

def random_by_sk(skew, kurt, size):
    def gengamma_error(a):
        s, k = (stats.gengamma.stats(a[0], a[1], moments="sk"))
        return (s - skew) ** 2 + (k - kurt) ** 2  # penalty equally weighted for skew and kurtosis

    a, c = optimize.minimize(gengamma_error, (1, 1)).x    
    return stats.gengamma.rvs(a, c, size=size)

n = random_by_sk(3, 3, 100000)
print("mean={}, var={}, skew={}, kurt={}".format(np.mean(n), np.var(n), stats.skew(n), stats.kurtosis(n)))
# will yield skew ~2 and kurtosis ~3 instead of 3, 3
ascripter
  • 5,665
  • 12
  • 45
  • 68