I have sorted data with pandas so that I have this dataframe (I work with anaconda, jupyter notebook):

dataframe

I plotted a histogram with "écart G-D" on the x-axis and "probabilité" on the y-axis.

histogram

I found a topic on Stack Overflow that deals with exactly what I want to do, except that it is 7 years old and the code is obsolete. I tried it anyway, correcting a few things, but it does not work (and I do not really understand the code). Here is the topic: Fitting empirical distribution to theoretical ones with Scipy (Python)?

I would like to find, graphically, the probability density function that best follows the shape of my histogram. If anyone could enlighten me, it would be great, because I'm really stuck.

Thank you.

Reblochon Masque
  • The question is quite vague. One pdf that would fit your data exactly is $f(-13) = 0.004975$, $f(-12) = 0.00995$, ... You need to at least specify a family of distributions that you would be interested in fitting. – Art Jun 05 '19 at 06:55
  • I would like to check against the exponential law, the Poisson law, and other laws; my goal is to find which law my data follows. I don't know if that is clear. – Caroline Rebouillat Jun 06 '19 at 09:09
  • Do you specifically exclude data at point 0? Or maybe it is an artifact of data processing that there is a point at -1, at +1, but not at 0? It makes your histogram plot wrong – Severin Pappadeux Jun 06 '19 at 22:21
  • Yes, you are right, I had this problem and I have changed the histogram plot, adding 0, 7 and 11 to the x-axis (the value at each of these points is 0). – Caroline Rebouillat Jun 07 '19 at 07:07

2 Answers

You can fit your data manually by calculating the parameters of a distribution (mean, lambda, etc.) and using scipy to generate that distribution. Alternatively, if your main objective is just to fit the data to a distribution and use that distribution later, you can use other software (Stat::Fit) to find the best fit automatically and plot it on the histogram.

  • Just from looking at the histogram, though, it resembles a log-normal distribution; estimate its parameters and try it out. – Luis Guzman Jun 05 '19 at 07:19
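
For what it's worth, here is a minimal sketch of the "fit with scipy" idea, in the spirit of the older topic linked in the question. It is not the asker's data: the sample is synthetic, and the helper name `rank_distributions`, the candidate list, and the bin count are all assumptions made for illustration. Each candidate is fitted with `scipy.stats.<name>.fit`, and the fitted pdf is compared to the normalized histogram by sum of squared errors. Only continuous laws are covered here; discrete laws such as Poisson have no `.fit` method on the distribution object and would need separate handling.

```python
import numpy as np
from scipy import stats


def rank_distributions(data, candidates=("norm", "lognorm", "expon", "gamma"), bins=30):
    """Fit each candidate continuous distribution and rank by SSE against the histogram.

    `candidates` holds names of scipy.stats continuous distributions (an
    illustrative choice, not a recommendation for any particular dataset).
    """
    data = np.asarray(data, dtype=float)
    # Normalized histogram, so bar heights are comparable to a pdf
    density, edges = np.histogram(data, bins=bins, density=True)
    centers = (edges[:-1] + edges[1:]) / 2

    results = []
    for name in candidates:
        dist = getattr(stats, name)
        try:
            params = dist.fit(data)  # maximum-likelihood estimate of shape/loc/scale
        except Exception:
            continue  # skip candidates whose fit fails on this data
        pdf = dist.pdf(centers, *params)
        sse = float(np.sum((density - pdf) ** 2))
        if np.isfinite(sse):
            results.append((name, sse, params))

    # Smallest error first
    return sorted(results, key=lambda r: r[1])


# Synthetic stand-in for the real column (assumption: roughly symmetric around 0)
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=4.0, size=2000)

ranking = rank_distributions(sample)
best_name, best_sse, best_params = ranking[0]
for name, sse, params in ranking:
    print(f"{name:8s} SSE={sse:.6f}")
```

To compare graphically, one option is to draw `plt.hist(data, bins=30, density=True)` with matplotlib and overlay `getattr(stats, name).pdf(x, *params)` for the top few entries of `ranking`. A low SSE only says the curve is close to the histogram bars; it is not a formal goodness-of-fit test.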

You can use the distfit library in Python. It will determine the best theoretical distribution for your data.

erdogant