Pandas how groupby and then count the number of unique items in each column?

Question

Using this code:

import pandas

df = pandas.DataFrame()

df['id']     = list('aaabbbcccdddeee')
df['gender'] = list('mmfmfmmffmfmfff')

I get a DataFrame like:

   id gender
0   a      m
1   a      m
2   a      f
3   b      m
4   b      f
5   b      m
6   c      m
7   c      f
8   c      f
9   d      m
10  d      f
11  d      m
12  e      f
13  e      f
14  e      f

How can I split df by id and then count the number of m and f in each category so that I can get a result that looks something like:

Easiest; `df.groupby(['id', 'gender']).gender.count().unstack(fill_value=0)`... if you need more performance, try `pivot` or `crosstab`. — cs95, May 18 '18 at 21:01
I feel like if it's a duplicate you should link to the duplicated version. Thanks for your answer. — user1367204, May 18 '18 at 21:06
I did, the linked answer is here https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe — cs95, May 18 '18 at 21:13
Don't worry... it isn't a bad thing to close as duplicate. Also, you're welcome! — cs95, May 18 '18 at 21:14

jpp · Answer 1 · 2018-05-18T21:18:15.027

0

You can use pd.pivot_table with len:

res = df.pivot_table(index='id', columns='gender',
                     aggfunc=len, fill_value=0)

print(res)

gender  f  m
id          
a       1  2
b       1  2
c       2  1
d       1  2
e       3  0

edited May 18 '18 at 21:18

answered May 18 '18 at 21:09

jpp

159,742
34
281
339

I don't hold the belief that a difference in agg functions is a cause to answer the question despite it being a duplicate in every other respect _anyway_, but I'll defer to your better judgement because I respect your opinion. – cs95 May 18 '18 at 21:21
@coldspeed, It's an interesting question I raised [here](https://meta.stackoverflow.com/questions/365180/duplicates-in-the-context-of-data-manipulation). Unfortunately, no answers forthcoming :(. – jpp May 18 '18 at 22:04

Pandas how groupby and then count the number of unique items in each column?

1 Answers1