Group values for groupby().mean()

Question

Have a Dataframe:

Column_A	Column_B
1	20
2	25
1	52
2	22
4	67
1	34
3	112
5	55
4	33
5	87
1	108

Looking to create 2 groups from Column_A, and find the average of those groups in Column_B:

So first group might be 1, 2 and 3, second group 4 and 5.

I get the basics behind groupby()

df.groupby("Column_A")["Column_B"].mean()

and calling certain values in columns

df[df["Column_A"] == 1].groupby()[].mean()

But is there a way to include the group of (1, 2 and 3) and (4, 5) from Column_A? Somehow doing:

[["Column_A"] == 1, 2, 3].groupby(Column_B).mean()

And:

[["Column_A"] == 4, 5].groupby(Column_B).mean()

Thanks in advance

check `.isin` https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html — Epsi95, Nov 15 '22 at 05:32

score 1 · Answer 1 · answered Nov 15 '22 at 06:13

1

You can combine cut to bin the first column, then groupby.mean:

(df.groupby(pd.cut(df['Column_A'], [0,3,5], labels=['1-3', '4-5']))
   ['Column_B'].mean()
 )

Output:

Column_A
1-3    53.285714
4-5    60.500000
Name: Column_B, dtype: float64

answered Nov 15 '22 at 06:13

mozway

194,879
13
39
75

score 0 · Answer 2 · answered Nov 15 '22 at 05:37

0

df[df["Column_A"] <= 3].groupby("Column_A")["Column_B"].mean()
df[df["Column_A"] > 3].groupby("Column_A")["Column_B"].mean()

if Column_A is not numeric, use isin

answered Nov 15 '22 at 05:37

Panda Kim

6,246
2
12

score 0 · Answer 3 · answered Nov 15 '22 at 05:51

0

Using isin if you only want a specific values group (1,2,3 and 4,5)

df[df["Column_A"].isin([1,2,3])].groupby("Column_A")["Column_B"].mean()
df[df["Column_A"].isin([4,5])].groupby("Column_A")["Column_B"].mean()

isin doc

answered Nov 15 '22 at 05:51

iamjaydev

142
8

DarrylG · Accepted Answer · 2022-11-18T10:33:02.643

0

Looking to create 2 groups from Column_A, and find the average of those groups in Column_B

You can use a condition in groupby for the items in [1, 2, 3] vs. not in the list i.e. [4, 5].

df.groupby(df['Column_A'].isin([1, 2, 3]))['Column_B'].mean()

Output:
Column_A
False    60.500000
True     53.285714
Name: Column_B, dtype: float64

what if i wanted to find the .count() of Column_B = 25 under the same isin conditions

# Is this what you wanted?
df.groupby(df['Column_A'].isin([1, 2, 3]) & (df['Column_B']==25))['Column_B'].count()

Output:
False    10
True      1
Name: Column_B, dtype: int64

edited Nov 18 '22 at 10:33

answered Nov 15 '22 at 11:15

DarrylG

16,732
2
17
23

This worked, thank you. I'd upvote but don't have the rep yet. what if i wanted to find the .count() of Column_B = 25 under the same isin conditions, for example? I think im putting brackets in the wrong place. – MB2000 Nov 18 '22 at 03:33
@MB2000 1) does the updated answer help? 2) you can accept an answer even if you don't have the rep to upvote. – DarrylG Nov 18 '22 at 10:34

Group values for groupby().mean()

4 Answers4