Similar to this previous post, I would like to compute the percentage within each group, but based on the sum of multiple columns, and also add subtotals. For example, given the dataframe below:
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame({'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
                   'office_id': list(range(1, 7)) * 2,
                   'sales': [np.random.randint(100000, 999999) for _ in range(12)],
                   'sales2': [np.random.randint(100000, 999999) for _ in range(12)],
                   'sales3': [np.random.randint(100000, 999999) for _ in range(12)]})
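To make the calculation concrete, here is a rough sketch of what I have in mind. It is only one possible reading, not necessarily the right or most efficient approach. It assumes the percentage is each office's combined sales (sales + sales2 + sales3) as a share of its state's combined sales, and that a subtotal is one extra row per state. The names `value_cols`, `total`, `pct` and the `'Total'` label are placeholders I made up for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame({'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
                   'office_id': list(range(1, 7)) * 2,
                   'sales': [np.random.randint(100000, 999999) for _ in range(12)],
                   'sales2': [np.random.randint(100000, 999999) for _ in range(12)],
                   'sales3': [np.random.randint(100000, 999999) for _ in range(12)]})

value_cols = ['sales', 'sales2', 'sales3']  # columns to combine (assumption)

# One row per (state, office_id); repeated pairs are summed together.
detail = df.groupby(['state', 'office_id'], as_index=False)[value_cols].sum()

# Combined sales per office across the three columns.
detail['total'] = detail[value_cols].sum(axis=1)

# Share of the state's combined sales, in percent.
state_total = detail.groupby('state')['total'].transform('sum')
detail['pct'] = 100 * detail['total'] / state_total

# One subtotal row per state; its pct is the sum of the office shares.
subtotal = detail.groupby('state', as_index=False)[value_cols + ['total', 'pct']].sum()
subtotal['office_id'] = 'Total'  # placeholder label for the subtotal row

# Stack detail and subtotal rows; a stable sort on state keeps each
# subtotal row after the office rows of the same state.
result = (pd.concat([detail, subtotal], ignore_index=True)
            .sort_values('state', kind='mergesort')
            .reset_index(drop=True))

print(result)
```

Grouping on both `state` and `office_id` means the same sketch should also cover the case where office IDs repeat within a state, since repeated pairs are summed before the percentage is taken.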
The ideal results would yield:
Update
Ideally, the solution would group by both state and office ID, to handle cases where the office_id column contains repeated values. Here is an example:
df = pd.DataFrame({'state': ['CA', 'WA', 'CO', 'AZ'] * 3,
                   'office_id': [1, 1, 1, 2, 2, 2] * 2,
                   'sales': [np.random.randint(100000, 999999) for _ in range(12)],
                   'sales2': [np.random.randint(100000, 999999) for _ in range(12)],
                   'sales3': [np.random.randint(100000, 999999) for _ in range(12)]})
This should then yield: