3

At the moment, I am fitting empirical distributions against theoretical ones as explained in Fitting empirical distribution to theoretical ones with Scipy (Python)?

Using the scipy.stats distributions, the results show a good fit for the hyperbolic secant distribution.

Here's my current approach using some of scipys distributions:

# -*- coding: utf-8 -*-

import pandas as pd
import numpy as np
import scipy.stats
import matplotlib.pyplot as plt


# Sample data with random numbers of hypsecant distribution
data = scipy.stats.hypsecant.rvs(size=8760, loc=1.93, scale=7.19)

# Distributions to check
dist_names = ['gausshyper', 'norm', 'gamma', 'hypsecant']

for dist_name in dist_names:

    dist = getattr(scipy.stats, dist_name)

    # Fit a distribution to the data
    param = dist.fit(data)

    # Plot the histogram
    plt.hist(data, bins=100, normed=True, alpha=0.8, color='g')

    # Plot and save the PDF
    xmin, xmax = plt.xlim()
    x = np.linspace(xmin, xmax, 100)
    p = dist.pdf(x, *param[:-2], loc=param[-2], scale=param[-1])
    plt.plot(x, p, 'k', linewidth=2)
    title = 'Distribution: ' + dist_name
    plt.title(title)
    plt.savefig('fit_' + dist_name + '.png')
    plt.close()

which delivers plots like the following:

enter image description here

But I would like to test the fit against a (generalized) hyperbolic distribution as well since I have the assumptions that it might deliver an even better fit.

Is there a hyperbolic distribution in scipy.stats that I can use? Or is there any workaround?

Using other packages would also be an option.

Thanks in advance!

Community
  • 1
  • 1
Cord Kaldemeyer
  • 6,405
  • 8
  • 51
  • 81
  • Related http://stackoverflow.com/questions/28934454/fitting-hyperbolic-and-harmonic-functions-with-curvefit – Eli Korvigo Jul 27 '16 at 07:36
  • Why the downvote? Of course there are some relations but my question is concerning scipy.stats explicitly. – Cord Kaldemeyer Jul 27 '16 at 07:52
  • In a comment here some asked the same question (implementation of hyperbolic dist. in scipy.stats) but did not get an answer: http://stackoverflow.com/questions/24011209/how-to-normalize-a-histogram-of-an-exponential-distributionin-scipy – Cord Kaldemeyer Jul 27 '16 at 07:55
  • Ah, o.k.. I didn't want to create redundancies. But I added my code and a plot now so that the problem should become clearer! – Cord Kaldemeyer Jul 27 '16 at 08:11

1 Answers1

3

As your distribution is not in scipy.stats you can either add it to the package or try doing things "by hand".

For the former have a look at the source code of the scipy.stats package - it might not be all that much work to add a new distribution!

For the latter option you can use a maximum Likelihood approach. To do so define first a method giving you the pdf of the distribution. Based on the pdf construct a function calculating the log likelihood of the data given specific parameters of the distribution. Finally fit your model to the data by maximizing this log likelihood function using scipy.optimize.

Andi
  • 1,233
  • 11
  • 17