I have a DataFrame:

df = pd.DataFrame(data=[[1,0],[1,0],[2,0],[2,1]],columns=['day','class'])

and I would like to count the instances of class 1 on each day. I use groupby like this:

df.groupby(['day','class'])['class'].count()

Out[51]: 
day  class
1    0        2
2    0        1
     1        1
Name: class, dtype: int64

but I would also like the result to show that day 1 has no instances of class 1:

Out[51]: 
day  class
1    0        2
     1        0
2    0        1
     1        1
Name: class, dtype: int64
jpp
gabboshow
  • Possible duplicate of [Pandas groupby for zero values](https://stackoverflow.com/questions/37003100/pandas-groupby-for-zero-values) – rudolfbyker Mar 06 '18 at 09:31

3 Answers

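Add unstack with the parameter fill_value=0, then stack. unstack pivots class into columns and fills the missing combinations with 0; stack turns the result back into a Series indexed by day and class: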

# Assign to a new name so the original DataFrame is not overwritten.
res = df.groupby(['day','class'])['class'].count().unstack(fill_value=0).stack()
print(res)
day  class
1    0        2
     1        0
2    0        1
     1        1
dtype: int64
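
To make the intermediate shape visible, here is a minimal sketch that splits the chain above into separate steps. The variable names counts, wide and full are illustrative only and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 0], [1, 0], [2, 0], [2, 1]],
                  columns=['day', 'class'])

# Step 1: count only the (day, class) pairs that actually occur.
counts = df.groupby(['day', 'class'])['class'].count()

# Step 2: pivot 'class' into columns; the missing (day=1, class=1)
# cell is filled with 0 instead of NaN.
wide = counts.unstack(fill_value=0)

# Step 3: stack the columns back into a Series indexed by (day, class).
full = wide.stack()

print(wide)
print(full)
```

Because fill_value is applied while the table is wide, every day ends up with an entry for every class that appears anywhere in the data.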
jezrael

An alternative using pivot_table, though less elegant than jezrael's solution:

df['class1'] = df['class']
df = df.pivot_table(index='class', columns='day', values='class1',
                    fill_value=0, aggfunc='count').unstack()

Output:

day  class
1    0        2
     1        0
2    0        1
     1        1
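
If the helper column feels awkward, a related option is pd.crosstab, which counts co-occurrences directly. This is a different technique from the pivot_table call above, shown only as a sketch; the names table and counts are illustrative.

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 0], [1, 0], [2, 0], [2, 1]],
                  columns=['day', 'class'])

# crosstab builds a day-by-class frequency table and writes 0
# for pairs that never occur together.
table = pd.crosstab(df['day'], df['class'])

# Stack the wide table into a Series indexed by (day, class).
counts = table.stack()
print(counts)
```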
Joe

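Here is one way, using categorical data. When the grouping keys are categorical, groupby can keep every combination of categories in the result, including combinations with no rows. This depends on the observed setting of groupby: newer pandas releases may require passing observed=False explicitly to keep the empty combinations.

This approach is data-oriented rather than operation-oriented: the missing combinations come from the column dtype, not from a reshaping step.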

df = pd.DataFrame(data=[[1,0], [1,0], [2,0], [2,1]],
                  columns=['day', 'class'],
                  dtype='category')

df['count'] = 1  # helper column to sum within each group
res = df.groupby(['class', 'day'], as_index=False)['count'].sum()
# Empty combinations may come back as NaN; set them to 0.
res['count'] = res['count'].fillna(0)

#   class day  count
# 0     0   1    2.0
# 1     0   2    1.0
# 2     1   1    0.0
# 3     1   2    1.0
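
A shorter variant of the same idea, as a sketch: it assumes a pandas version whose groupby accepts the observed parameter, and it uses size() instead of a helper column. The name counts is illustrative.

```python
import pandas as pd

df = pd.DataFrame(data=[[1, 0], [1, 0], [2, 0], [2, 1]],
                  columns=['day', 'class']).astype('category')

# observed=False asks groupby to keep category combinations
# that have no rows; size() reports them with a count of 0.
counts = df.groupby(['day', 'class'], observed=False).size()
print(counts)
```

Passing observed=False explicitly keeps the result independent of whichever default the installed pandas version uses.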
jpp