My model generates an unfortunately small number of data points (7 in total). I have to fit a distribution to them, plot its PDF and CDF, and then use the CDF to calculate the x values at certain probability points (say, 60%, 75%, 95%).

I have developed a way to do this but I feel it is very inelegant and I would appreciate help in finding a more robust solution. Here's what I have:

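import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


x = [0.09, 1.08, -0.42, 0.08, -0.28, -0.65, -0.04]


probability = 0.6

# Histogram with a kernel density estimate (the PDF)
pdf = sns.distplot(x, norm_hist=False, kde=True)
plt.show()

# Cumulative version; the first line on the axes is the KDE-based CDF curve,
# returned as a pair of arrays (x values, cumulative probabilities)
cdf = sns.distplot(x,
                   hist_kws=dict(cumulative=True),
                   kde_kws=dict(cumulative=True)).get_lines()[0].get_data()
plt.show()

# Index of the first point on the curve where the CDF exceeds the target probability
ix = np.argmax(cdf[1] > probability)

print('At %d%% probability the risk premium is approx %0.2f PLN'
      % (int(probability * 100), cdf[0][ix]))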

I'd appreciate it if anyone could point me towards a better way of solving this. Thanks!

Dave R

1 Answer

Estimating a PDF from only 7 points is not meaningful; the result will have no statistical relevance. You need to generate more data points before you can do anything useful with this.

Once you have done that, you could also use this code to calculate the PDF.
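
As a rough sketch of what a plot-free calculation might look like (not necessarily the code referred to above), assuming SciPy is available and that a Gaussian kernel density estimate with SciPy's default bandwidth is acceptable for your data. The helper names `kde_cdf` and `kde_quantile` are made up for this example, and the search bracket of 10 standard deviations is an arbitrary choice:

```python
import numpy as np
from scipy import optimize, stats

# Sample values from the question; replace with the larger data set.
x = np.array([0.09, 1.08, -0.42, 0.08, -0.28, -0.65, -0.04])

# Gaussian KDE with SciPy's default bandwidth (Scott's rule).
kde = stats.gaussian_kde(x)


def kde_cdf(value):
    """CDF of the KDE: integral of the estimated density from -inf to value."""
    return kde.integrate_box_1d(-np.inf, value)


def kde_quantile(p):
    """x value where the KDE's CDF equals p (hypothetical helper)."""
    # Assumed bracket: wide enough that the CDF is ~0 at lo and ~1 at hi.
    spread = 10 * x.std(ddof=1)
    lo, hi = x.min() - spread, x.max() + spread
    return optimize.brentq(lambda v: kde_cdf(v) - p, lo, hi)


# PDF and CDF evaluated on a grid, ready to be passed to a plotting call.
grid = np.linspace(x.min() - 1.0, x.max() + 1.0, 200)
pdf = kde(grid)
cdf = np.array([kde_cdf(v) for v in grid])

probabilities = [0.60, 0.75, 0.95]
quantiles = [kde_quantile(p) for p in probabilities]

for p, q in zip(probabilities, quantiles):
    print('At %d%% probability the risk premium is approx %0.2f PLN'
          % (round(p * 100), q))
```

Solving for the quantile with a root finder avoids depending on the resolution of the plotted curve, which is what the index lookup on the seaborn line data does. The caveat about sample size still applies: with 7 points the numbers printed here depend heavily on the bandwidth.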

zimmerrol