2

I have the following pandas dataframe:

import pandas as pd

df = pd.DataFrame([[1,2,3,'a'],[4,5,6,'a'],[2,4,1,'a'],[2,4,1,'b'],[4,9,6,'b'],[2,4,1,'b']], index=[0,1,2,0,1,2], columns=['aa','bb','cc','cat'])


   aa  bb  cc cat
0   1   2   3   a
1   4   5   6   a
2   2   4   1   a
0   2   4   1   b
1   4   9   6   b
2   2   4   1   b

I need to combine the rows that share the same index: the numeric columns should be summed and the 'cat' strings concatenated.

   aa  bb  cc cat
0   3   6   4  ab
1   8  14  12  ab
2   4   8   2  ab

I used the following code:

df_ab = df[df['cat'] == 'a'] + df[df['cat'] == 'b']

Is this the most Pythonic way to do it?

m13op22
jAguesses

3 Answers

5

Use groupby and agg, giving each column its own aggregation:

df.groupby(df.index).agg({'aa': 'sum',
                          'bb': 'sum',
                          'cc': 'sum',
                          'cat': ''.join})

Or pass numeric_only=False to sum (simpler, but I wouldn't recommend it):

df.groupby(df.index).sum(numeric_only=False)

Both output

    aa  bb  cc cat
0   3   6   4  ab
1   8  14  12  ab
2   4   8   2  ab
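
For reference, here is a minimal self-contained sketch of the groupby/agg approach above, run against the frame from the question. It assumes only that pandas is imported as pd; the variable name result is chosen here for illustration.

```python
import pandas as pd

# The frame from the question: two categories sharing index labels 0, 1, 2.
df = pd.DataFrame(
    [[1, 2, 3, 'a'], [4, 5, 6, 'a'], [2, 4, 1, 'a'],
     [2, 4, 1, 'b'], [4, 9, 6, 'b'], [2, 4, 1, 'b']],
    index=[0, 1, 2, 0, 1, 2],
    columns=['aa', 'bb', 'cc', 'cat'],
)

# Group rows by index label, sum the numeric columns and
# concatenate the strings in 'cat' (illustrative name: result).
result = df.groupby(df.index).agg(
    {'aa': 'sum', 'bb': 'sum', 'cc': 'sum', 'cat': ''.join}
)
print(result)
```

Unlike the `+` approach in the question, this should also handle more than two categories, since every row sharing an index label ends up in the same group.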
rafaelc
3

We can check each column's dtype and choose the aggregation function accordingly:

df.groupby(level=0).agg(lambda x : x.sum() if x.dtype!='object' else ''.join(x))
Out[271]: 
   aa  bb  cc cat
0   3   6   4  ab
1   8  14  12  ab
2   4   8   2  ab
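
A possible variant of the same idea, sketched under the assumption that string columns might not always have the 'object' dtype (for example if they use pandas' dedicated string dtype). It tests for numeric columns with pandas.api.types.is_numeric_dtype instead of comparing against 'object'. The helper name sum_or_join is hypothetical and not part of pandas.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame(
    [[1, 2, 3, 'a'], [4, 5, 6, 'a'], [2, 4, 1, 'a'],
     [2, 4, 1, 'b'], [4, 9, 6, 'b'], [2, 4, 1, 'b']],
    index=[0, 1, 2, 0, 1, 2],
    columns=['aa', 'bb', 'cc', 'cat'],
)

def sum_or_join(col):
    """Hypothetical helper: sum numeric columns, concatenate everything else."""
    return col.sum() if is_numeric_dtype(col) else ''.join(col)

# level=0 groups on the (single-level) index, as in the answer above.
result = df.groupby(level=0).agg(sum_or_join)
print(result)
```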
BENY
0

Use this one-liner :)

import numpy as np

(df.reset_index().groupby("index")
 .agg(lambda x: np.sum(x) if x.dtype.kind == "i" else "".join(x)))
ivallesp