Pandas group by column and count values

Question

I have a dataframe:

date        code     result  
2020-01-01  2069.0   Negative
2020-01-29  2069.0   Negative
2020-02-06  2069.0   Positive
2020-02-06  2070.0   Negative
2020-02-07  2070.0   Positive

Grouping by code, I want to find how many results are 'Positive', and how many results are 'Positive' or 'Negative' combined. I'm quite new to pandas, so I'm confused by all the functions that are available.

Thanks!

Try this: `df.groupby(['code', 'result']).count()` – Mayank Porwal May 28 '20 at 15:39

score 0 · Accepted Answer · answered May 28 '20 at 15:43

You can try groupby.agg:

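# 1 for 'Positive' rows, 0 otherwise (astype replaces Series.view, deprecated in pandas 2.2)
is_positive = df['result'].eq('Positive').astype('int8')
# sum = number of positives per code, count = number of rows per code
d = {'sum': 'Positive', 'count': 'Both'}
is_positive.groupby(df['code']).agg(['sum', 'count']).rename(columns=d)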

        Positive  Both
code                  
2069.0         1     3
2070.0         1     2
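
For anyone who wants to check this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and computes the same two columns. The variable names (`is_positive`, `counts`) are only illustrative, and named aggregation is used as one possible alternative to `rename`, not as the only way to do it.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2020-01-01", "2020-01-29", "2020-02-06", "2020-02-06", "2020-02-07"],
    "code": [2069.0, 2069.0, 2069.0, 2070.0, 2070.0],
    "result": ["Negative", "Negative", "Positive", "Negative", "Positive"],
})

# Integer flag: 1 where the result is 'Positive', 0 otherwise.
is_positive = df["result"].eq("Positive").astype(int)

# Per code: sum of the flag = positives, count = rows in the group.
counts = is_positive.groupby(df["code"]).agg(Positive="sum", Both="count")
print(counts)
```

Because the flag is built with `eq`, every row ends up as 0 or 1, so `count` here sees every row of each code.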

anky

Thanks! Would this also count NaN values in the Both column? – Jenny Char May 28 '20 at 16:02
@JennyChar No. For that, use `size` instead of `count`: [What is the difference between size and count in pandas?](https://stackoverflow.com/questions/33346591/what-is-the-difference-between-size-and-count-in-pandas) – anky May 28 '20 at 16:03
Oh no, that's great. I was just wondering because I get a higher count for 'both' results than when I use `df.groupby(['code', 'result']).count()`, so now I'm not sure which method is accurate. Do you know why that might be the case? – Jenny Char May 28 '20 at 16:08
When you group by `['code', 'result']`, it returns one row per unique combination of `code` and `result`, whereas when you group by `code` it aggregates on the code column only. Your question says you want to group by the code column. – anky May 28 '20 at 16:10
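
The two points from the comments (`size` versus `count`, and grouping by `code` versus `['code', 'result']`) can be seen side by side in a small sketch. It assumes the question's data plus one extra row with a missing result; that row is hypothetical and only added to make the difference visible.

```python
import numpy as np
import pandas as pd

# Question's sample data plus one hypothetical row with a missing result.
df = pd.DataFrame({
    "code": [2069.0, 2069.0, 2069.0, 2070.0, 2070.0, 2070.0],
    "result": ["Negative", "Negative", "Positive", "Negative", "Positive", np.nan],
})

# One row per code: count skips the NaN result, size includes it.
per_code = df.groupby("code")["result"].agg(["count", "size"])

# One row per (code, result) pair; rows with a NaN key are dropped by default.
per_pair = df.groupby(["code", "result"]).size()

print(per_code)
print(per_pair)
```

Grouping by both columns splits each code's total across its result values, so a single per-pair number is never larger than the per-code total.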
