I've been searching for a few hours and can't find a topic that covers this exact problem.
Basically, I want to compute something other than the mean on a groupby. My groupby returns two columns, 'feature_name' and 'target_name', and I want to replace the value in 'target_name' with something else: the number of occurrences of 1, the number of occurrences of 0, the difference between the two, etc.
When I print my dataframe restricted to the columns I use:
print(df[[feature_name, target_name]])
I get the following: screenshot
I already have the following code to compute the mean of 'target_name' for each value of 'feature_name':
df[[feature_name, target_name]].groupby([feature_name], as_index=False).mean()
Which returns: this.
Now I want to compute things other than the mean. Here are the values I want to end up with: what I want
In my case, the column 'target_name' will always be equal to either 1 or 0 (with 1 being 'good' and 0 being 'bad').
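To make the goal concrete, here is a minimal sketch of the kind of result I'm after, using pandas named aggregation. The sample data, the column names "feature" and "target", and the output names n_good, n_bad and diff are all made up for illustration; they are not my real data.

```python
import pandas as pd

# Hypothetical sample data standing in for my real dataframe.
feature_name, target_name = "feature", "target"
df = pd.DataFrame({
    feature_name: ["a", "a", "a", "b", "b", "c"],
    target_name: [1, 0, 1, 0, 0, 1],
})

# Named aggregation: each keyword becomes an output column,
# built from (input column, aggregation function).
summary = df.groupby(feature_name, as_index=False).agg(
    n_good=(target_name, "sum"),                   # target is 0/1, so the sum counts the 1s
    n_bad=(target_name, lambda s: (s == 0).sum()), # count the 0s explicitly
    mean=(target_name, "mean"),
)

# Derived column: difference between the number of 1s and the number of 0s.
summary["diff"] = summary["n_good"] - summary["n_bad"]
print(summary)
```

Counting the 1s with "sum" only works because the target is strictly 0 or 1; with other values an explicit comparison like the one used for n_bad would be needed.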
I have seen this example in an answer:
df.groupby(['catA', 'catB'])['scores'].apply(lambda x: x[x.str.contains('RET')].count())
But I don't know how to apply this to my case, since x would simply be an int. And even after solving this, I still need to compute more than just the count!
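One possible adaptation of that example, as a rough sketch: when apply is called on a grouped column, pandas passes each group to the lambda as a Series of values rather than a single int, so a comparison such as x == 1 should work element-wise. The data and the column names "feature", "target", "ones", "zeros" and "diff" below are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; "feature" and "target" are placeholder names.
df = pd.DataFrame({
    "feature": ["a", "a", "a", "b", "b", "c"],
    "target": [1, 0, 1, 0, 0, 1],
})

grouped = df.groupby("feature")["target"]

# x is the Series of target values for one group, so (x == 1) is a boolean
# Series and .sum() counts how many values are True.
ones = grouped.apply(lambda x: (x == 1).sum())
zeros = grouped.apply(lambda x: (x == 0).sum())

# Combine the per-group results; both Series are indexed by "feature".
result = pd.DataFrame({
    "ones": ones,
    "zeros": zeros,
    "diff": ones - zeros,
}).reset_index()
print(result)
```

This runs one apply per statistic, which is easy to read but probably slower than a single agg call on large dataframes.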
Thanks for reading ☺