
Dual keys statistics.

To process data exported from Excel, I am trying to compute statistics with a pandas DataFrame. The sheet is structured like this:

Section  Team    Bugs  Functions  month
A        dev     1     1          Jan
B        dev     2     2          Jan
B        design  3     0          Feb
A        design  4     3          Jan
A        design  4     3          Sep
A        dev     4     3          Jul
B        dev     2     2          Nov

I need the total number of Bugs and Functions for every (Section, Team) pair. For example, based on the data above, for (A, dev) the total number of bugs (TNB) is 5 (1 + 4). For (A, design), the TNB is 8 (4 + 4) and the total number of functions (TNF) is 6 (3 + 3).

I have tried the following:

for key, value in df.value_counts().items(): ... However, this only produces statistics for a single key, which is not the result I expect.

Alternatively, I could split the original table into several sub-tables by Section, but that adds extra steps.

Any suggestions (an algorithm or an available API) for computing these statistics with lower time and space complexity?
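One way to get per-pair totals without splitting the table is to group on both columns at once. The following is a minimal sketch, assuming the sheet has already been loaded into a DataFrame with the column names shown above; here the sample rows are typed in by hand rather than read from Excel, and the variable name `totals` is only illustrative.

```python
import pandas as pd

# Sample rows copied from the table in the question (assumed column names).
df = pd.DataFrame({
    "Section": ["A", "B", "B", "A", "A", "A", "B"],
    "Team": ["dev", "dev", "design", "design", "design", "dev", "dev"],
    "Bugs": [1, 2, 3, 4, 4, 4, 2],
    "Functions": [1, 2, 0, 3, 3, 3, 2],
    "month": ["Jan", "Jan", "Feb", "Jan", "Sep", "Jul", "Nov"],
})

# Group by both keys in a single pass and sum only the numeric columns of interest.
totals = df.groupby(["Section", "Team"])[["Bugs", "Functions"]].sum()
print(totals)

# Look up one pair through the resulting two-level index.
print(totals.loc[("A", "dev"), "Bugs"])
```

The result is indexed by (Section, Team), so each pair can be looked up directly; calling `totals.reset_index()` should turn the keys back into ordinary columns if a flat table is preferred.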

Rui
  • This is not the same question. Why did the system close it automatically? – Rui Nov 22 '22 at 08:55
  • Got it. It seems the recommended solution is correct. An additional question: how do I count the occurrences of each (Section, Team) pair? For example, (A, dev) appears in Jan and in Jul, so it should be counted 2 times. – Rui Nov 22 '22 at 09:22
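For the follow-up about counting how often each (Section, Team) pair occurs, a sketch along the same lines is shown below. It again assumes the column names from the question and uses hand-typed sample rows. `groupby(...).size()` counts rows per group; `DataFrame.value_counts` with a `subset` argument should give the same counts, assuming pandas 1.1 or later.

```python
import pandas as pd

# Sample rows copied from the table in the question (assumed column names).
df = pd.DataFrame({
    "Section": ["A", "B", "B", "A", "A", "A", "B"],
    "Team": ["dev", "dev", "design", "design", "design", "dev", "dev"],
    "Bugs": [1, 2, 3, 4, 4, 4, 2],
    "Functions": [1, 2, 0, 3, 3, 3, 2],
    "month": ["Jan", "Jan", "Feb", "Jan", "Sep", "Jul", "Nov"],
})

# Number of rows for each (Section, Team) pair.
pair_counts = df.groupby(["Section", "Team"]).size()
print(pair_counts)

# Same counts via value_counts restricted to the two key columns
# (sorted by count rather than by key).
pair_counts_vc = df.value_counts(subset=["Section", "Team"])
print(pair_counts_vc)
```

Both results are Series indexed by the pair, so `pair_counts[("A", "dev")]` gives the count for a single pair.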

0 Answers