1

Have a Dataframe:

Column_A Column_B
1 20
2 25
1 52
2 22
4 67
1 34
3 112
5 55
4 33
5 87
1 108

Looking to create 2 groups from Column_A, and find the average of those groups in Column_B:

So first group might be 1, 2 and 3, second group 4 and 5.

I get the basics behind groupby()

df.groupby("Column_A")["Column_B"].mean()

and calling certain values in columns

df[df["Column_A"] == 1].groupby()[].mean()

But is there a way to include the group of (1, 2 and 3) and (4, 5) from Column_A? Somehow doing:

[["Column_A"] == 1, 2, 3].groupby(Column_B).mean()

And:

[["Column_A"] == 4, 5].groupby(Column_B).mean()

Thanks in advance

BeRT2me
  • 12,699
  • 2
  • 13
  • 31
MB2000
  • 13
  • 3
  • check `.isin` https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html – Epsi95 Nov 15 '22 at 05:32

4 Answers4

1

You can combine cut to bin the first column, then groupby.mean:

(df.groupby(pd.cut(df['Column_A'], [0,3,5], labels=['1-3', '4-5']))
   ['Column_B'].mean()
 )

Output:

Column_A
1-3    53.285714
4-5    60.500000
Name: Column_B, dtype: float64
mozway
  • 194,879
  • 13
  • 39
  • 75
0
df[df["Column_A"] <= 3].groupby("Column_A")["Column_B"].mean()
df[df["Column_A"] > 3].groupby("Column_A")["Column_B"].mean()

if Column_A is not numeric, use isin

Panda Kim
  • 6,246
  • 2
  • 12
0

Using isin if you only want a specific values group (1,2,3 and 4,5)

df[df["Column_A"].isin([1,2,3])].groupby("Column_A")["Column_B"].mean()
df[df["Column_A"].isin([4,5])].groupby("Column_A")["Column_B"].mean()

isin doc

iamjaydev
  • 142
  • 8
0

Looking to create 2 groups from Column_A, and find the average of those groups in Column_B

You can use a condition in groupby for the items in [1, 2, 3] vs. not in the list i.e. [4, 5].

df.groupby(df['Column_A'].isin([1, 2, 3]))['Column_B'].mean()

Output:
Column_A
False    60.500000
True     53.285714
Name: Column_B, dtype: float64

what if i wanted to find the .count() of Column_B = 25 under the same isin conditions

# Is this what you wanted?
df.groupby(df['Column_A'].isin([1, 2, 3]) & (df['Column_B']==25))['Column_B'].count()

Output:
False    10
True      1
Name: Column_B, dtype: int64
DarrylG
  • 16,732
  • 2
  • 17
  • 23
  • This worked, thank you. I'd upvote but don't have the rep yet. what if i wanted to find the .count() of Column_B = 25 under the same isin conditions, for example? I think im putting brackets in the wrong place. – MB2000 Nov 18 '22 at 03:33
  • @MB2000 1) does the updated answer help? 2) you can accept an answer even if you don't have the rep to upvote. – DarrylG Nov 18 '22 at 10:34