I have a pandas dataframe df with columns [a, b, c, d, e, f]. I want to perform a group by on df. I can best describe what it's supposed to do in SQL:

SELECT a, b, min(c), min(d), max(e), sum(f)
FROM df
GROUP BY a, b 

How do I do this group by using pandas on my dataframe df?

Consider df:

a  b  c  d  e  f
1  1  2  5  9  3    
1  1  3  3  4  5  
2  2  4  7  4  4 
2  2  5  3  8  8 

I expect the result to be:

a  b  c  d  e  f
1  1  2  3  9  8    
2  2  4  3  8  12 
piRSquared
Filip Eriksson
  • Please provide a sample dataframe and expected output. – Fabio Lamanna Dec 21 '16 at 18:15
  • dupe: http://stackoverflow.com/questions/33217702/groupby-in-pandas-with-different-functions-for-different-columns and this: http://stackoverflow.com/questions/30674708/how-to-apply-different-aggregation-functions-to-same-column-by-using-pandas-grou – EdChum Dec 21 '16 at 23:33

1 Answer

Use agg, passing a dict that maps each column to the aggregation to apply to it:

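import numpy as np
import pandas as pd

# Example frame: two grouping columns (a, b) and four numeric columns.
df = pd.DataFrame(
    dict(
        a=list('aaaabbbb'),
        b=list('ccddccdd'),
        c=np.arange(8),
        d=np.arange(8),
        e=np.arange(8),
        f=np.arange(8),
    )
)

# Map each column to its aggregation, mirroring the SQL SELECT list.
funcs = dict(c='min', d='min', e='max', f='sum')
# reset_index() turns the group keys a and b back into ordinary columns.
df.groupby(['a', 'b']).agg(funcs).reset_index()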

   a  b  c  e   f  d
0  a  c  0  1   1  0
1  a  d  2  3   5  2
2  b  c  4  5   9  4
3  b  d  6  7  13  6

With your data:

   a  b  c  e   f  d
0  1  1  2  9   8  3
1  2  2  4  8  12  3
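
For reference, here is a minimal, self-contained sketch of the same agg call run against the sample data from the question. The frame is rebuilt by hand from the table above, and the name result is only illustrative. It assumes a reasonably recent pandas, where the aggregated columns come out in the order the dict lists them. Passing as_index=False is used here as an alternative to calling reset_index() afterwards.

```python
import pandas as pd

# Sample data from the question, typed in by hand.
df = pd.DataFrame(
    {
        "a": [1, 1, 2, 2],
        "b": [1, 1, 2, 2],
        "c": [2, 3, 4, 5],
        "d": [5, 3, 7, 3],
        "e": [9, 4, 4, 8],
        "f": [3, 5, 4, 8],
    }
)

# One aggregation per column, matching
# SELECT a, b, min(c), min(d), max(e), sum(f) ... GROUP BY a, b
funcs = {"c": "min", "d": "min", "e": "max", "f": "sum"}

# as_index=False keeps the group keys a and b as regular columns.
result = df.groupby(["a", "b"], as_index=False).agg(funcs)
print(result)
```

The two rows of result should match the expected output in the question: (1, 1, 2, 3, 9, 8) and (2, 2, 4, 3, 8, 12).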
piRSquared