In the example from the pandas documentation about the new .pipe()
method for GroupBy objects, an .apply()
method accepting the same lambda would return the same results.
In [195]: import numpy as np
In [196]: n = 1000
In [197]: df = pd.DataFrame({'Store': np.random.choice(['Store_1', 'Store_2'], n),
.....: 'Product': np.random.choice(['Product_1', 'Product_2', 'Product_3'], n),
.....: 'Revenue': (np.random.random(n)*50+10).round(2),
.....: 'Quantity': np.random.randint(1, 10, size=n)})
In [199]: (df.groupby(['Store', 'Product'])
.....: .pipe(lambda grp: grp.Revenue.sum()/grp.Quantity.sum())
.....: .unstack().round(2))
Out[199]:
Product Product_1 Product_2 Product_3
Store
Store_1 6.93 6.82 7.15
Store_2 6.69 6.64 6.77
I can see how the pipe
functionality differs from apply
for DataFrame objects, but not for GroupBy objects. Does anyone have an explanation or examples of what can be done with pipe
but not with apply
for a GroupBy?