How to compute statistics on a DataFrame keyed by (Section, Team) pairs?

Asked Nov 22 '22 at 08:49

Active Nov 22 '22 at 08:59

Viewed 15 times

Dual-key statistics.

To process data exported from Excel, I am using a pandas DataFrame to compute statistics. The sheet is structured like this:

Section	Team	Bugs	Functions	month
A	dev	1	1	Jan
B	dev	2	2	Jan
B	design	3	0	Feb
A	design	4	3	Jan
A	design	4	3	Sep
A	dev	4	3	Jul
B	dev	2	2	Nov

I need the total number of Bugs and Functions for every (Section, Team) pair. For example, based on the data above, for (A, dev) the total number of bugs (TNB) = 5 (1+4), and for (A, design) TNB = 8 (4+4) while the total number of functions (TNF) = 6 (3+3).
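For reference, the per-pair totals described above can be sketched with a two-column `groupby`. This is only a minimal sketch: it assumes the column names match the table headers (`Section`, `Team`, `Bugs`, `Functions`, `month`) and that `Bugs` and `Functions` are already numeric after loading from Excel.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Section": ["A", "B", "B", "A", "A", "A", "B"],
    "Team": ["dev", "dev", "design", "design", "design", "dev", "dev"],
    "Bugs": [1, 2, 3, 4, 4, 4, 2],
    "Functions": [1, 2, 0, 3, 3, 3, 2],
    "month": ["Jan", "Jan", "Feb", "Jan", "Sep", "Jul", "Nov"],
})

# Group on both key columns at once, then sum only the numeric columns.
totals = df.groupby(["Section", "Team"])[["Bugs", "Functions"]].sum()
print(totals)

# Look up one pair through the resulting (Section, Team) MultiIndex.
print(totals.loc[("A", "dev"), "Bugs"])
```

The result is indexed by the (Section, Team) pair; calling `totals.reset_index()` should turn the keys back into ordinary columns if a flat table is easier to work with.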

I have tried the following:

for key, value in df.value_counts().items(): ... However, it only produces statistics for a single key, which is not the expected result.
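To illustrate what the attempt above does: as far as I understand, `DataFrame.value_counts()` with no arguments counts distinct full rows (all columns together), and it counts rows rather than summing values. The sketch below, assuming the same column names as the table and a pandas version that supports the `subset` parameter, contrasts that with restricting the count to the two key columns.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Section": ["A", "B", "B", "A", "A", "A", "B"],
    "Team": ["dev", "dev", "design", "design", "design", "dev", "dev"],
    "Bugs": [1, 2, 3, 4, 4, 4, 2],
    "Functions": [1, 2, 0, 3, 3, 3, 2],
    "month": ["Jan", "Jan", "Feb", "Jan", "Sep", "Jul", "Nov"],
})

# No arguments: every distinct combination of ALL columns is one key.
row_counts = df.value_counts()

# With subset: only the (Section, Team) pair is used as the key.
pair_counts = df.value_counts(subset=["Section", "Team"])

for key, value in pair_counts.items():
    print(key, value)
```

Either way, `value_counts` only reports how often a key occurs; it does not add up the `Bugs` or `Functions` columns.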

Alternatively, I could split the original table into several sub-tables by Section, but that adds extra steps.

Any suggestions (an algorithm or an available API) for computing these statistics with both lower time and lower space complexity?

edited Nov 22 '22 at 08:59

Rui

This is not the same question. Why did the system close it automatically? – Rui Nov 22 '22 at 08:55
Got it. It seems the recommended solution is correct. An additional question: how do I count the occurrences of each (section, team) pair? For example, (A, dev) appears in Jan and in Jul, so it should be counted 2 times. – Rui Nov 22 '22 at 09:22
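One possible way to get the occurrence count alongside the totals is named aggregation on the same two-key `groupby`. This is a hedged sketch rather than the linked solution: the output column names `total_bugs`, `total_functions`, and `occurrences` are made up for illustration, and the count is taken over the `month` column on the assumption that it has no missing values.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Section": ["A", "B", "B", "A", "A", "A", "B"],
    "Team": ["dev", "dev", "design", "design", "design", "dev", "dev"],
    "Bugs": [1, 2, 3, 4, 4, 4, 2],
    "Functions": [1, 2, 0, 3, 3, 3, 2],
    "month": ["Jan", "Jan", "Feb", "Jan", "Sep", "Jul", "Nov"],
})

# One pass over the groups: sums and a row count per (Section, Team) pair.
# The output column names are hypothetical, chosen for this example.
summary = df.groupby(["Section", "Team"]).agg(
    total_bugs=("Bugs", "sum"),
    total_functions=("Functions", "sum"),
    occurrences=("month", "count"),  # counts non-null month entries
)
print(summary)
```

If only the counts are needed, `df.groupby(["Section", "Team"]).size()` should give the same numbers without depending on any particular column being non-null.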
