8

I have:

df = pd.DataFrame({'A':[1, 2, -3],'B':[1,2,6]})
df
    A   B
0   1   1
1   2   2
2   -3  6

Q: How do I get:

    A
0   1
1   2
2   1.5

using groupby() and aggregate()?

Something like,

df.groupby([0,1], axis=1).aggregate('mean')

So basically, I want to group along axis=1 and use row indexes 0 and 1 as the grouping keys (without using transpose).

Ankur Agarwal
  • 1
    Are you, by any chance, looking for just `df.apply(pd.Series.mean, 1)`? You can also get a dataframe out of this with `df.apply(pd.Series.mean, 1).to_frame('A')`. – Abdou Dec 24 '17 at 21:00

4 Answers

3

Are you looking for this?

df.mean(1)
Out[71]: 
0    1.0
1    2.0
2    1.5
dtype: float64
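
If the result should be a one-column DataFrame named A, as in the question, the row-wise mean can be wrapped with `to_frame`, as the comment under the question also suggests. A minimal sketch, assuming the question's `df`:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, -3], "B": [1, 2, 6]})

# Row-wise mean across all columns, returned as a DataFrame with one column "A"
result = df.mean(axis=1).to_frame("A")
print(result)
```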

If you do want to use groupby:

df.groupby(['key']*df.shape[1],axis=1).mean()
Out[72]: 
   key
0  1.0
1  2.0
2  1.5
BENY
  • 1
    But I'd like to be able to specify row indexes as the first argument to groupby (analogous to specifying column indexes as the first argument when doing groupby with axis=0). Do you see what I mean? – Ankur Agarwal Dec 25 '17 at 00:07
3

Grouping keys can come in four forms. I will mention only the first and third, which are relevant to your question. The following is from "Data Analysis Using Pandas":

Each grouping key can take many forms, and the keys do not have to be all of the same type:

• A list or array of values that is the same length as the axis being grouped

• A dict or Series giving a correspondence between the values on the axis being grouped and the group names

So you can pass either an array with the same length as the axis being grouped (here, the columns axis) or a dict, like the following:

df1.groupby({x:'mean' for x in df1.columns}, axis=1).mean()

    mean
0   1.0
1   2.0
2   1.5
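
The dict form becomes more useful when columns map to different group names. The sketch below assumes a hypothetical third column `C` and hypothetical group names `first` and `second`, none of which appear in the question. It groups on the transposed frame rather than passing `axis=1`, because recent pandas releases deprecate the `axis=1` argument of `groupby`; this is one possible workaround, not the only one.

```python
import pandas as pd

# Column "C" is a hypothetical addition for illustration
df1 = pd.DataFrame({"A": [1, 2, -3], "B": [1, 2, 6], "C": [2, 3, 1]})

# Hypothetical mapping: column label -> group name
mapping = {"A": "first", "B": "first", "C": "second"}

# After transposing, the column labels become the index, so the dict keys
# are matched against them; transposing back restores the original rows.
result = df1.T.groupby(mapping).mean().T
print(result)
```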
iDrwish
  • The code can be shortened to `df1.groupby([1,1], axis=1).mean()` or `df1.groupby(['SS','SS'], axis=1).mean()`, but @iDrwish's code is more readable because the dict states the mapping explicitly. – sakeesh Aug 26 '22 at 10:55
3

Given the original DataFrame df as follows:

   A  B  C
0  1  1  2
1  2  2  3
2 -3  6  1

Use the command

df.groupby(by=lambda x : df[x].loc[0],axis=1).mean()

to get the desired output:

     1    2
0  1.0  2.0
1  2.0  3.0
2  1.5  1.0

Here, the function lambda x : df[x].loc[0] is used to map columns A and B to 1 and column C to 2. This mapping is then used to decide the grouping.

You can also use any complex function defined outside the groupby statement instead of the lambda function.
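
As a rough illustration of a key function defined outside the call, the sketch below labels each column by its values in rows 0 and 1, which is what the question asks for. The names `key_from_first_rows` and `mean_by_column_groups` are hypothetical helpers, not pandas API. The helper averages each group of columns with `mean(axis=1)` instead of calling `groupby(..., axis=1)`, since recent pandas releases deprecate that argument; no transpose is involved.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, -3], "B": [1, 2, 6], "C": [2, 3, 1]})


def key_from_first_rows(column):
    """Hypothetical key function: label a column by its values in rows 0 and 1."""
    first, second = df[column].loc[[0, 1]]
    return f"{first}_{second}"


def mean_by_column_groups(frame, key_func):
    """Hypothetical helper: average columns that share the same key, row by row."""
    columns_by_key = {}
    for column in frame.columns:
        columns_by_key.setdefault(key_func(column), []).append(column)
    # One output column per key, holding the row-wise mean of its member columns
    return pd.DataFrame(
        {key: frame[cols].mean(axis=1) for key, cols in columns_by_key.items()}
    )


result = mean_by_column_groups(df, key_from_first_rows)
print(result)
```

Columns A and B share the values 1 and 2 in rows 0 and 1, so they fall into one group, while C forms a group of its own.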

Kamini
-1

Try this:

df["A"] = df.loc[:, ["A", "B"]].mean(axis=1)  # row-wise mean of columns A and B
df.drop(columns=["B"], inplace=True)          # drop B so only the averaged column remains
      A
 0   1.0
 1   2.0
 2   1.5
doubleTnT