Please note that agg()
function cannot be used for calculation involving multiple columns. You have to use apply()
function instead. Refer to this post for details.
if you had multiple columns that needed to interact together then you
cannot use agg, which implicitly passes a Series to the aggregating
function. When using apply the entire group as a DataFrame gets passed
into the function.
For your case, you have to define a customized function as follows:
def f(x):
data = {}
data['Total pre discount'] = (x['price'] * x['count']).sum()
data['Discount applied'] = (x['price'] * x['count'] * x['discount']).sum()
return pd.Series(data)
Then perform your desired task by:
df.groupby('client_name').apply(f)
or if you want to use lambda function instead of customized function:
df.groupby('client_name').apply(lambda x: pd.Series({'Total pre discount': (x['price'] * x['count']).sum(), 'Discount applied': (x['price'] * x['count'] * x['discount']).sum()}))
Run Demonstration
Test Data Creation
data = {'id': ['0100', '0200', '0100', '0200', '0300'], 'client_name': ['Ann', 'Bob', 'Ann', 'Bob', 'Charles'], 'product': ['pen', 'paper', 'folder', 'pencil', 'tray'], 'count': [12, 300, 5, 12, 10], 'price': [2.00, 5.00, 3.50, 2.30, 8.20], 'discount': [0.0, 0.1, 0.15, 0.1, 0.12]}
df = pd.DataFrame(data)
print(df)
Output:
id client_name product count price discount
0 0100 Ann pen 12 2.0 0.00
1 0200 Bob paper 300 5.0 0.10
2 0100 Ann folder 5 3.5 0.15
3 0200 Bob pencil 12 2.3 0.10
4 0300 Charles tray 10 8.2 0.12
Run New Codes
# Use either one of the following 2 lines of codes:
summary = df.groupby('client_name').apply(f) # Using the customized function f() defined above
# or using lambda function
summary = df.groupby('client_name').apply(lambda x: pd.Series({'Total pre discount': (x['price'] * x['count']).sum(), 'Discount applied': (x['price'] * x['count'] * x['discount']).sum()}))
print(summary)
Output:
Total pre discount Discount applied
client_name
Ann 41.5 2.625
Bob 1527.6 152.760
Charles 82.0 9.84