1

I have a set of data and I would like to count the occurrence of 'Bad' for a same number, and the result should show the count of 'Bad' in a new column, which is same function the COUNTIFS excel formula =COUNTIFS(N:N,N2,O:O,"Bad") example of  the result

I have try to use the code below:

countbad = df.groupby('No')['Status'].value_counts().reset_index(name='count')

df['CountBad'] = countbad('count')

but the result I get is include counting based on Good/Moderate as well.

result got from the codes

Is there any way to made the 'Bad' as the condition to count the number of occurrence and get exactly what I needed without changing the dataframe structure?

Thanks

aminography
  • 21,986
  • 13
  • 70
  • 74
BBBBBBBB
  • 165
  • 1
  • 10
  • kindly post a reproducible example. see [guide](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – sammywemmy Apr 10 '20 at 05:37

1 Answers1

1

First compare column by Series.eq with Bad for mask and convert to numeric 0,1 by Series.astype or Series.view and then count values by GroupBy.transform for new column with sum:

df = pd.DataFrame({
        'No':list('aabbccd'),
         'Status':['Bad', 'Good', 'Bad', 'Bad', 'Good', 'Good', 'Bad'],

})

df['CountBad'] = df['Status'].eq('Bad').astype(int).groupby(df['No']).transform('sum')
#alternative
df['CountBad'] = df['Status'].eq('Bad').view('i1').groupby(df['No']).transform('sum')
print (df)

  No Status  CountBad
0  a    Bad         1
1  a   Good         1
2  b    Bad         2
3  b    Bad         2
4  c   Good         0
5  c   Good         0
6  d    Bad         1

Your solution is possible by reshape with Series.unstack:

df1 = df.groupby('No')['Status'].value_counts().unstack(fill_value=0)

And then Series.map:

df['CountBad'] = df['No'].map(df1['Bad'])

If want all counts columns use DataFrame.join with DataFrame.add_prefix:

df2 = df.join(df1.add_prefix('Count'), on='No')
print (df2)
  No Status  CountBad  CountGood
0  a    Bad         1          1
1  a   Good         1          1
2  b    Bad         2          0
3  b    Bad         2          0
4  c   Good         0          2
5  c   Good         0          2
6  d    Bad         1          0
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • can i know how to modify if i would like to count O2 as condition for row 2, the O3 for row 3 and so on? same as the excel formula: ```=COUNTIFS(N:N,N2,O:O,O2)``` – BBBBBBBB May 29 '20 at 04:30
  • @BBBBBBBB - Sorry, what it means? How should be changed `df = pd.DataFrame({ 'No':list('aabbccd'), 'Status':['Bad', 'Good', 'Bad', 'Bad', 'Good', 'Good', 'Bad'], })` if applied `=COUNTIFS(N:N,N2,O:O,O2)` ? thanks. – jezrael May 29 '20 at 05:06
  • kinda different: ```df = pd.DataFrame({'Al': list('aabccdd'),'Status': ['Bad','Good','ok','Bad','good','ok','ok']``` result will be ```Count: 1,1,1,1,1,2,2```, meaning count Status in status based on count in Al in Al, something like that, sorry if it is confusing, thanks – BBBBBBBB May 29 '20 at 05:36
  • @BBBBBBBB - do you think `df['Count'] = df.groupby(['Al', 'Status'])['Status'].transform('size')` ? – jezrael May 29 '20 at 05:40
  • 1
    it works, i did not think of using .transform('size'), thanks – BBBBBBBB May 29 '20 at 05:55