I want to know how scipy.stats implements its fit and pdf methods. According to the documentation, fit(data, a, loc=0, scale=1) estimates the distribution parameters from data, while pdf(x, a, loc=0, scale=1) computes the probability density function. But I couldn't find how fit and pdf are actually carried out, statistically and mathematically.
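To make the question concrete, here is my current understanding of pdf, which may well be incomplete. As far as I can tell from the scipy.stats docs, loc and scale shift and rescale a standardized density, so pdf(x, a, loc, scale) should equal pdf((x - loc) / scale, a) / scale, and for gennorm the standardized density is beta / (2 * Gamma(1 / beta)) * exp(-|x|**beta). The parameter values below are arbitrary illustrative numbers, not fitted results.

```python
import numpy as np
import scipy.stats as st
from scipy.special import gamma

# Arbitrary illustrative parameters (assumptions, not fitted values)
beta, loc, scale = 4.0, 23.0, 4.0
x = np.linspace(15.0, 31.0, 9)

# Density as returned by scipy
pdf_scipy = st.gennorm.pdf(x, beta, loc=loc, scale=scale)

# Same density via the standardized variable z = (x - loc) / scale
z = (x - loc) / scale
pdf_standardized = st.gennorm.pdf(z, beta) / scale

# Same density written out from the documented gennorm formula
pdf_manual = beta / (2.0 * scale * gamma(1.0 / beta)) * np.exp(-np.abs(z) ** beta)

print(np.allclose(pdf_scipy, pdf_standardized))
print(np.allclose(pdf_scipy, pdf_manual))
```

If this is right, the shape argument controls the form of the curve, while loc and scale only move and stretch it.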
I am using the sm.datasets.elnino data, with code from tmthydvnprt:
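import warnings
import numpy as np
import pandas as pd
import scipy.stats as st
import statsmodels.api as sm  # statsmodels.api exposes the datasets subpackage
import matplotlib
import matplotlib.pyplot as plt

# Flatten the monthly El Nino sea-surface temperatures into a single series
data = pd.Series(sm.datasets.elnino.load_pandas().data.set_index('YEAR').values.ravel())

# Normalized histogram; convert the 51 bin edges into 50 bin centers
y, x = np.histogram(data, bins=50, density=True)
x = (x + np.roll(x, -1))[:-1] / 2.0

# Fit a generalized normal distribution to the data
distribution = st.gennorm
params = distribution.fit(data)

# fit returns (shape parameters..., loc, scale)
arg = params[:-2]
loc = params[-2]
scale = params[-1]

# Evaluate the fitted density at the bin centers and compare with the histogram
pdf = distribution.pdf(x, *arg, loc=loc, scale=scale)
sse = np.sum(np.power(y - pdf, 2.0))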
With this data, the fit gives arg = 4.3836, loc = 23.2991, and scale = 3.8499.
I want to know what arg, loc, and scale represent and how they are calculated.
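My working assumption is that fit performs maximum likelihood estimation by default, that is, it numerically minimizes the negative log-likelihood over the shape, loc, and scale parameters. The sketch below is my own attempt to reproduce that idea, not scipy's actual implementation: the function negative_log_likelihood, the starting guess, and the synthetic sample (used here instead of the El Nino data so the snippet is self-contained) are all my own choices.

```python
import numpy as np
import scipy.stats as st
from scipy.optimize import minimize

# Synthetic sample with known parameters (stand-in for the real data)
sample = st.gennorm.rvs(4.0, loc=23.0, scale=4.0, size=2000, random_state=0)

def negative_log_likelihood(theta, data):
    """Negative log-likelihood of gennorm for theta = (beta, loc, scale)."""
    beta, loc, scale = theta
    if beta <= 0 or scale <= 0:
        return np.inf  # reject invalid shape or scale values
    return -np.sum(st.gennorm.logpdf(data, beta, loc=loc, scale=scale))

# Crude starting guess: normal-like shape, sample mean and standard deviation
start = np.array([2.0, np.mean(sample), np.std(sample)])
result = minimize(negative_log_likelihood, start, args=(sample,),
                  method="Nelder-Mead")
beta_hat, loc_hat, scale_hat = result.x

# Parameters from scipy's own fit, for comparison
fit_params = st.gennorm.fit(sample)

print("manual MLE:", beta_hat, loc_hat, scale_hat)
print("scipy fit: ", fit_params)
```

If the two sets of estimates roughly agree, that would suggest fit is doing something like this, but I would like confirmation of what actually happens internally.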
Thank you.