Using SciPy, I am trying to reproduce the Weibull fit from this question. My fit looks good when I use the genextreme distribution as follows:

import numpy as np
from scipy.stats import genextreme
import matplotlib.pyplot as plt

data=np.array([37.50,46.79,48.30,46.04,43.40,39.25,38.49,49.51,40.38,36.98,40.00,
               38.49,37.74,47.92,44.53,44.91,44.91,40.00,41.51,47.92,36.98,43.40,
               42.26,41.89,38.87,43.02,39.25,40.38,42.64,36.98,44.15,44.91,43.40,
               49.81,38.87,40.00,52.45,53.13,47.92,52.45,44.91,29.54,27.13,35.60,
               45.34,43.37,54.15,42.77,42.88,44.26,27.14,39.31,24.80,16.62,30.30,
               36.39,28.60,28.53,35.84,31.10,34.55,52.65,48.81,43.42,52.49,38.00,
               38.65,34.54,37.70,38.11,43.05,29.95,32.48,24.63,35.33,41.34])

shape, loc, scale  = genextreme.fit(data)

# density=True replaces the old normed=True argument, which Matplotlib has removed
plt.hist(data, density=True, bins=np.linspace(15, 55, 9))

x = np.linspace(data.min(), data.max(), 1000)
y = genextreme.pdf(x, shape, loc, scale)
plt.plot(x, y, 'c', linewidth=3)
plt.show()

The fitted parameters (shape, loc, scale) are: (0.44693977076022462, 38.283622522613214, 7.9180988170857374). The shape parameter is positive, which matches the sign of the shape parameter on the Weibull Wikipedia page. As I understand it, this is equivalent to a negative shape parameter in R?

So it seems genextreme decides by itself whether the distribution is Gumbel, Frechet or Weibull. Here it has chosen Weibull.

Now I am trying to reproduce a similar fit with the weibull_min distribution. I have tried the following based on this post, but the parameters look very different from what I got with genextreme:

from scipy.stats import weibull_min
weibull_min.fit(data, floc=0)

The parameters now are: (6.4633107529634319, 0, 43.247460728065136)

Is the 0 the shape parameter? Surely it should be positive if the distribution is Weibull?

Oliver Angelil
  • shameless plug: paramnormal might help you out here: http://phobson.github.io/paramnormal/tutorial/fitting.html – Paul H Aug 04 '16 at 19:16

1 Answer

The parameters returned by weibull_min.fit() are (shape, loc, scale). loc is the location parameter. (All scipy distributions include a location parameter, even those where a location parameter isn't normally used.) The docstring of weibull_min.fit includes this:

Returns
-------
shape, loc, scale : tuple of floats
    MLEs for any shape statistics, followed by those for location and
    scale.

You used the argument floc=0, so, as expected, the location parameter returned by fit(data, floc=0) is 0.
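As a minimal sketch of the point above, the snippet below fits weibull_min with the location fixed and unpacks the result in (shape, loc, scale) order. The synthetic sample and its generating values (shape 6, scale 40) are assumptions chosen for illustration only; they are not the data from the question.

```python
import numpy as np
from scipy.stats import weibull_min

# Synthetic sample, assumed here only for illustration (not the question's data).
rng = np.random.default_rng(0)
sample = 40.0 * rng.weibull(6.0, size=500)

# Fixing floc=0 leaves only the shape and scale to be estimated.
shape, loc, scale = weibull_min.fit(sample, floc=0)
print("shape:", shape, "loc:", loc, "scale:", scale)

# The fitted curve is evaluated by passing the parameters in the same order.
x = np.linspace(sample.min(), sample.max(), 200)
y = weibull_min.pdf(x, shape, loc, scale)
```

The middle value comes back as exactly the fixed location, and the first value is the (positive) Weibull shape. The arrays x and y can be handed to plt.plot in the same way as the genextreme curve in the question.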

Warren Weckesser
  • so the shape parameter I get with weibull_min.fit() is 6.46. That is very different from 0.44 with genextreme. And isn't a scale parameter of 43 quite high? How would one fit a curve to the data using weibull_min.fit()? – Oliver Angelil Aug 04 '16 at 12:08
  • *Regarding the values:* The values returned by `weibull_min.fit(data, floc=0)` match those returned by `fitdistr(mydata, "weibull")` in R pretty closely, as you can see in the linked question. – Warren Weckesser Aug 04 '16 at 12:39
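On why the two shape values differ so much: one possible reading, sketched below, is that the two fits are not the same family. In SciPy's parameterisation, genextreme with a positive shape c has an upper bound at loc + scale/c and coincides with a reversed Weibull (weibull_max) of shape 1/c, whereas weibull_min with floc=0 is bounded below at 0. The snippet checks that identity numerically using the genextreme parameters reported in the question; the variable names k, upper and s are introduced here for illustration.

```python
import numpy as np
from scipy.stats import genextreme, weibull_max

# Parameters reported in the question for genextreme.fit(data).
c, loc, scale = 0.44693977076022462, 38.283622522613214, 7.9180988170857374

# Equivalent reversed-Weibull (weibull_max) parameters.
k = 1.0 / c               # Weibull shape
upper = loc + scale / c   # upper bound of the support, used as weibull_max loc
s = scale / c             # Weibull scale

# Compare the two densities on a grid inside the support (x < upper).
x = np.linspace(16.0, 55.0, 200)
pdf_gev = genextreme.pdf(x, c, loc=loc, scale=scale)
pdf_wmax = weibull_max.pdf(x, k, loc=upper, scale=s)
max_abs_diff = float(np.max(np.abs(pdf_gev - pdf_wmax)))
print("k:", k, "upper:", upper, "s:", s, "max abs diff:", max_abs_diff)
```

Under this reading, the genextreme shape of about 0.45 and the weibull_min shape of about 6.46 are parameters of two different models (one measured back from an upper bound, the other up from zero), so they should not be expected to agree.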