I have a frequency table of test scores:
score count
----- -----
77 1105
78 940
79 1222
80 4339
etc
I want to show basic statistics and a boxplot for the sample that the frequency table summarizes. (For example, for the four rows shown above, the mean is 79.16 and the median is 80.)
Is there a way to do this in Pandas? All the examples I have seen assume a table of individual cases.
I suppose I could generate a list of individual scores, like this --
In [1]: import pandas as pd
In [2]: s = pd.Series([77] * 1105 + [78] * 940 + [79] * 1222 + [80] * 4339)
In [3]: s.describe()
Out[3]:
count    7606.000000
mean       79.156324
std         1.118439
min        77.000000
25%        78.000000
50%        80.000000
75%        80.000000
max        80.000000
dtype: float64
-- but I am hoping to avoid that; the total count in the real (non-toy) dataset runs well into the billions.
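To make clear what I am after: the same numbers can be derived from the counts alone, without expanding them. Below is a rough sketch in plain Python. The helper name freq_describe is my own invention, not a Pandas function, and I am assuming that describe() uses the sample standard deviation (ddof=1) and linearly interpolated quantiles. What I am hoping for is something built into Pandas that does the equivalent.

```python
import math
from bisect import bisect_right
from itertools import accumulate


def freq_describe(scores, counts):
    """Describe-style statistics from (score, count) pairs only.

    Hypothetical helper, not part of Pandas. Memory use depends on the
    number of distinct scores, not on the total number of cases.
    """
    pairs = sorted(zip(scores, counts))
    values = [v for v, _ in pairs]
    freqs = [c for _, c in pairs]
    n = sum(freqs)

    mean = sum(v * c for v, c in pairs) / n
    # Sample standard deviation (ddof=1); undefined for fewer than 2 cases.
    ss = sum(c * (v - mean) ** 2 for v, c in pairs)
    std = math.sqrt(ss / (n - 1)) if n > 1 else float("nan")

    # cum[i] = number of cases whose score is <= values[i]
    cum = list(accumulate(freqs))

    def value_at(k):
        # Score of the k-th case (0-based) in the sorted, expanded sample.
        return values[bisect_right(cum, k)]

    def quantile(q):
        # Linear interpolation between the two neighbouring cases.
        pos = q * (n - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return value_at(lo) + (value_at(hi) - value_at(lo)) * (pos - lo)

    return {
        "count": n,
        "mean": mean,
        "std": std,
        "min": values[0],
        "25%": quantile(0.25),
        "50%": quantile(0.50),
        "75%": quantile(0.75),
        "max": values[-1],
    }


stats = freq_describe([77, 78, 79, 80], [1105, 940, 1222, 4339])
for name, value in stats.items():
    print(f"{name:<6}{value:>14.6f}")
```

The min, quartiles and max in that dict are also everything a basic box needs, so the plot should not require the individual cases either (matplotlib has Axes.bxp for drawing from precomputed statistics, though I have not tried it here).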
Any help appreciated.
(I think this is a different question from "Using describe() with weighted data", which is about applying weights to individual cases.)