pandas how to count column boolean value that based on group

Question

I have a Dataframe df ,you can have it by running the following code:

import pandas as pd
from io import StringIO
    
df = """  
    month       status_review             supply_review  
2023-01-01           False                    False                   
2023-01-01           True                     True                  
2022-12-01           False                    True                  
2022-12-01           True                     True                
2022-12-01           False                    False   
    """
df= pd.read_csv(StringIO(df.strip()), sep='\s\s+', engine='python')

How can I count how many status_reviews and supply_review are True in each month?

The output should look like the following:

    month       # of true status_review      # of true supply_review  
2023-01-01           1                         1              
2022-12-01           1                         2

It works ,do you mind posting it as an answer ,so that I can check,thanks! — William, Jan 18 '23 at 20:32

score 2 · Answer 1 · answered Jan 18 '23 at 20:34

2

You can use .groupby() and .sum(), which is more concise than using .agg():

df.groupby("month").sum()

This outputs:

            status_review  supply_review
month
2022-12-01              1              2
2023-01-01              1              1

answered Jan 18 '23 at 20:34

BrokenBenchmark

18,126
7
21
33

Thank you for you reply ,another question ,how can I only choose some of the column to do sum ,and keep other column value the same,at the same timie – William Jan 18 '23 at 21:31
Not sure. Perhaps this would be good for another question? – BrokenBenchmark Jan 18 '23 at 21:34

score 2 · Accepted Answer · answered Jan 18 '23 at 20:34

Assuming you have one date per month, like in this case, you can simply use groupby and sum all columns.

import pandas as pd
from io import StringIO
    
# setup sample data
df = """  
    month       status_review             supply_review  
2023-01-01           False                    False                   
2023-01-01           True                     True                  
2022-12-01           False                    True                  
2022-12-01           True                     True                
2022-12-01           False                    False   
    """
df= pd.read_csv(StringIO(df.strip()), sep='\s\s+', engine='python')

# sum all rows by month column
df.groupby('month').agg('sum')

If you have multiple dates per month, then you can convert the column to datetime type and then sum by month as follows:

df['month'] = pd.to_datetime(df['month'])
df.groupby([pd.Grouper(key='month', freq='M')]).agg('sum')

Golden Lion · Answer 3 · 2023-01-26T22:11:51.520

use groupby and value_counts() with a reset_index to create a dataframe column for the counts. find the p-value by using the wilcoxon

data = """  
    month       status_review             supply_review  
2023-01-01           False                    False                   
2023-01-01           True                     True                  
2022-12-01           False                    True                  
2022-12-01           True                     True                
2022-12-01           False                    False   
    """
df= pd.read_csv(StringIO(data.strip()), sep='\s\s+', engine='python')
print(df)

results_status=df.groupby('month')['status_review'].value_counts().reset_index(name='n').sort_values('month')
results_supply=df.groupby('month')['supply_review'].value_counts().reset_index(name='n').sort_values('month')

results=pd.concat([results_status,results_supply],axis=1)
print(results)

output

    month  status_review  n       month  supply_review  n
0  2022-12-01          False  2  2022-12-01           True  2
1  2022-12-01           True  1  2022-12-01          False  1
2  2023-01-01          False  1  2023-01-01          False  1
3  2023-01-01           True  1  2023-01-01           True  1

https://pingouin-stats.org/build/html/generated/pingouin.wilcoxon.html — Golden Lion, Jan 19 '23 at 16:05

pandas how to count column boolean value that based on group

3 Answers3