
I'm working with the pandas library in Python. Suppose I have four random samples drawn from normal distributions in the following way:

import numpy as np
import pandas as pd

np.random.seed(12345)

df = pd.DataFrame([np.random.normal(32000,20000,3650), 
                   np.random.normal(43000,10000,3650), 
                   np.random.normal(43500,14000,3650), 
                   np.random.normal(48000,7000,3650)], 
                  index=[1992,1993,1994,1995])

I want to get a 95% confidence interval for each of these samples, so I calculate:

mean_value=df.mean(axis=1)
std_value=df.std(axis=1,ddof=0)
lower_bound=mean_value-1.96*std_value
upper_bound=mean_value+1.96*std_value
diff = upper_bound-lower_bound

For each confidence interval, I want to cut it into 11 equally spaced intervals. I had an idea like the following:

low=lower_bound.values[1]
high=upper_bound.values[1]
diff=0.09*diff.values[1]
np.arange(low,high,diff)

This doesn't quite work: the last cut point falls short of the upper end of the confidence interval. How can I get equally spaced intervals that end exactly on the upper bound?

user21359

1 Answer


I'm not exactly sure what you're after, but it's easy to get equally spaced intervals with NumPy's linspace function, which always includes both endpoints. Eleven intervals need twelve edge points, so ask for 12. Here are the edges of the 11 intervals for the first distribution.

np.linspace(lower_bound.values[0], upper_bound.values[0], 12)
array([ -7.18705879e+03,  -3.82825067e+01,   7.11049377e+03,
         1.42592701e+04,   2.14080463e+04,   2.85568226e+04,
         3.57055989e+04,   4.28543752e+04,   5.00031514e+04,
         5.71519277e+04,   6.43007040e+04,   7.14494803e+04])
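
If you want the edges for all four samples at once rather than one row at a time, linspace also accepts array-valued start and stop (this needs NumPy 1.16 or later, as far as I know). The following is only a sketch that rebuilds the question's setup; the names n_intervals, edges, edges_df and widths are mine, not from the question.

```python
import numpy as np
import pandas as pd

np.random.seed(12345)

# Same setup as in the question: one row of 3650 draws per year.
df = pd.DataFrame([np.random.normal(32000, 20000, 3650),
                   np.random.normal(43000, 10000, 3650),
                   np.random.normal(43500, 14000, 3650),
                   np.random.normal(48000, 7000, 3650)],
                  index=[1992, 1993, 1994, 1995])

mean_value = df.mean(axis=1)
std_value = df.std(axis=1, ddof=0)
lower_bound = mean_value - 1.96 * std_value
upper_bound = mean_value + 1.96 * std_value

# Hypothetical name: number of equal-width intervals wanted per row.
n_intervals = 11

# n_intervals + 1 edge points per row; axis=-1 puts the edges along
# the last axis, so the result has shape (4, 12).
edges = np.linspace(lower_bound.values, upper_bound.values,
                    n_intervals + 1, axis=-1)

# Label the rows with the years again for easier lookup.
edges_df = pd.DataFrame(edges, index=df.index)

# Width of each interval, row by row (constant within a row).
widths = np.diff(edges, axis=1)
```

Each row of edges_df then starts at that year's lower bound and ends exactly on its upper bound, and edges_df.loc[1993] gives the cut points for the second sample.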
Ted Petrou