
I am using the following code to group by `dateTime` and compute aggregates such as a sum and a count:

groups = df[df['isTrade'] == 1].groupby('dateTime')
grouped = groups.agg({'tradeBid': [np.sum, lambda x: (x > 0).sum()]})

The output is giving me:

tradeBid  tradeBid
sum       <lambda>

79        46
7         6
4         4
20        6

How can I change the output's column headers so that my end users will know what this data is?

Giladbi

1 Answer


You can provide names like this:

groups.agg({
  'tradeBid': [
    ('sum', np.sum),
    ('other', lambda x: (x > 0).sum())
  ]
})

Each 2-tuple is (output column name, function). You used to be able to pass a dict of names to functions instead of a list of 2-tuples, but that is now deprecated (probably because the ordering of the columns is then arbitrary).
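To see the renaming end to end, here is a minimal sketch on made-up data. The sample rows, the label `positive_count`, and the final flattening step are assumptions for illustration and are not part of the question; the string `"sum"` stands in for `np.sum`, which pandas treats equivalently here.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "dateTime": ["09:00", "09:00", "09:01", "09:01", "09:01"],
    "isTrade": [1, 1, 1, 0, 1],
    "tradeBid": [5, 0, 3, 9, 4],
})

groups = df[df["isTrade"] == 1].groupby("dateTime")

# Each tuple is (output column name, aggregation function).
labelled = groups.agg({
    "tradeBid": [
        ("sum", "sum"),
        ("positive_count", lambda x: (x > 0).sum()),
    ]
})

# The result has two-level columns such as ("tradeBid", "sum").
# Optionally join the levels into single flat names for end users.
flat = labelled.copy()
flat.columns = ["_".join(col) for col in labelled.columns]
print(flat)
```

Flattening is optional; it only matters if the two-level header is awkward for whoever consumes the output.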

Harry Moreno
John Zwinck
  • Is this documented somewhere? – ayhan Jan 28 '18 at 10:56
  • @ayhan: The docs say `agg()` accepts "dict of column names -> functions (or list of functions)" but does not say that a list of 2-tuples is an acceptable substitute. Nor does it say that using a dict now results in a deprecation warning. But I know that many things in NumPy/Pandas which could use a dict can also use a list of (name, value) tuples. So I tried it and it worked. So no, it's not documented. :) – John Zwinck Jan 28 '18 at 10:59
  • Yeah I have never seen it so I thought maybe they added this after deprecating dict renaming but it seems it was always possible. Good to know. :) – ayhan Jan 28 '18 at 11:01
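On the documentation question raised in the comments: pandas 0.25 and later offer "named aggregation", which is a documented way to get the same result. It is a different technique from the list of 2-tuples above. The sketch below assumes a made-up DataFrame, and the output names `bid_total` and `positive_bids` are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "dateTime": ["09:00", "09:00", "09:01", "09:01", "09:01"],
    "isTrade": [1, 1, 1, 0, 1],
    "tradeBid": [5, 0, 3, 9, 4],
})

# Named aggregation: output_name=(input_column, function).
named = (
    df[df["isTrade"] == 1]
    .groupby("dateTime")
    .agg(
        bid_total=("tradeBid", "sum"),
        positive_bids=("tradeBid", lambda x: (x > 0).sum()),
    )
)
print(named)
```

With this form the result has flat, single-level column names, so no extra flattening step is needed.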