Python Skewness and Kurtosis in Naive Bayes classifier

Question

I am creating a Naive Bayes classifier in Python that will be able to guess which month it is based on some weather data of a single day.

Currently the mean and standard deviation are used to classify the month; however, I figured that adding skewness and kurtosis might help improve the accuracy.

I am currently using scipy.stats.norm.cdf to calculate the probability, but I cannot seem to find any cdf function in Python that takes skewness and kurtosis into account.

I feel like I might not be understanding skewness and kurtosis correctly. Skewness and kurtosis have an impact on the cdf function and therefore I expected them to be given as a parameter.

Is there something fundamentally wrong with my understanding of skewness, kurtosis and the cdf function? If not, then where can I find an implementation of the cdf function in Python that takes all these parameters into account?

It might not solve your problem, but take a look at: http://scikit-learn.org/stable/modules/naive_bayes.html — Dietrich, Nov 27 '15 at 22:05
In a normal distribution skewness and kurtosis are both zero and therefore you will have to use a different kind of distribution if you want to somehow define it from these parameters. — Leandro Caniglia, Nov 28 '15 at 12:28

score 2 · Accepted Answer · answered Nov 27 '15 at 21:58

The normal distribution, which you use (scipy.stats.norm) and which is typically used to model the one-dimensional conditional distributions in Naive Bayes, is fully defined by just two parameters: its mean and standard deviation. There is no point in specifying skewness or kurtosis, as they are constant for every normal distribution (skewness is 0 and kurtosis is 3, i.e. excess kurtosis is 0).
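As a quick check, here is a minimal sketch that asks scipy.stats.norm for its skewness and kurtosis under a few different parameters. The (mean, std) pairs are arbitrary values chosen for illustration, and normal_shape is a hypothetical helper name. Note that SciPy reports excess kurtosis (kurtosis minus 3), so a normal distribution shows up as 0 rather than 3.

```python
from scipy.stats import norm


def normal_shape(mean, std):
    """Return (skewness, excess kurtosis) of a normal distribution."""
    # moments="mvsk" asks for mean, variance, skewness and excess kurtosis.
    _, _, skewness, excess_kurtosis = norm.stats(
        loc=mean, scale=std, moments="mvsk"
    )
    return float(skewness), float(excess_kurtosis)


# Arbitrary illustrative parameters: the shape values do not depend on them.
for mean, std in [(0.0, 1.0), (12.5, 4.0), (-3.0, 0.5)]:
    print(mean, std, normal_shape(mean, std))
```

Whatever mean and std you pass, both shape values stay the same, which is why norm.cdf has no parameter for them.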

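What you are probably thinking of is the Pearson family of distributions, which can be fitted to more moments (mean, std, skewness and kurtosis). SciPy provides the type III member of that family, scipy.stats.pearson3, which takes the skewness as a shape parameter alongside loc and scale (it has no separate kurtosis parameter):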

http://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.stats.pearson3.html
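A minimal sketch of how pearson3 could stand in for norm in the per-feature step, assuming a simple method-of-moments fit (sample mean, sample std and sample skewness). The temperature values are made up, and fit_pearson3 and pearson3_cdf are hypothetical helper names, not part of SciPy.

```python
import numpy as np
from scipy.stats import norm, pearson3, skew


def fit_pearson3(samples):
    """Method-of-moments estimates: (mean, std, skewness) of the samples."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(), samples.std(ddof=1), float(skew(samples))


def pearson3_cdf(x, mean, std, skewness):
    """CDF of a Pearson type III distribution with the given moments."""
    # In scipy.stats.pearson3, loc is the mean and scale is the std.
    return pearson3.cdf(x, skewness, loc=mean, scale=std)


# Made-up daily temperatures for one month (illustration only).
temps = [14.0, 15.5, 15.0, 16.0, 17.5, 15.2, 21.0, 14.8, 16.3, 24.0]
mean, std, skewness = fit_pearson3(temps)

print("pearson3 cdf:", pearson3_cdf(18.0, mean, std, skewness))
print("normal cdf:  ", norm.cdf(18.0, loc=mean, scale=std))
```

The two values differ because the sample is right-skewed. Kurtosis is still not a free parameter here: for the type III distribution it is determined by the skewness, so matching all four moments would need a different distribution.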
