How can I add new columns to a dataframe with grouped amounts in percentage?

Question

This dataframe is given:

data = {'dates': ['01.05.2020', '01.05.2020', '01.05.2020', '02.05.2020', '02.05.2020', '03.05.2020', '03.05.2020'], 
        'amount' : [500,1000, 2000,3000,1000,4000, 2300], 'state' : ['ABC', 'ERT', 'ABC', 'XYZ', 'ERT' , 'ZEQ', 'ERT']}

In the column state we have four differerent values (ABC, ERT, XYZ & ZEQ). Now my goal is to add new columns for the amount per state in percentage for each day. Hence, we need to group by dates and then calculate the daily amount in percentage for each state.

So e.g. the first line of the new data frame should look like:

How can I do this in the easiest way?

jezrael · Accepted Answer · 2020-08-24T11:57:50.787

1

Use DataFrame.pivot_table for pivoting with sum and then divide by summed values per rows:

df = pd.DataFrame(data)

df1 = df.pivot_table(index='dates', 
                    columns='state', 
                    values='amount', 
                    fill_value=0, 
                    aggfunc='sum')
df1 = df1.div(df1.sum(axis=1), axis=0).round(2)
print (df1)
state        ABC   ERT   XYZ   ZEQ
dates                             
01.05.2020  0.71  0.29  0.00  0.00
02.05.2020  0.00  0.25  0.75  0.00
03.05.2020  0.00  0.37  0.00  0.63

edited Aug 24 '20 at 11:57

answered Aug 24 '20 at 11:48

jezrael

822,522
95
1,334
1,252

One question regarding the function "div". I understand that with sum(axis=1) we add up the values in each column (in the first line 0.71 +0.29 +0+0). However, what is the second attribute (axis=0) for? – Minfetli Aug 24 '20 at 15:10
@Minfetli You can check [this](https://stackoverflow.com/q/53217607/2901002) – jezrael Aug 24 '20 at 16:56

How can I add new columns to a dataframe with grouped amounts in percentage?

1 Answers1