4

I like to have subtotal based on group, so it wont disturb in the grand total,

The code is

import pandas as pd
import numpy as np
dict_data = [{'duration': 0.7, 'project_id': 3, 'resource': u'Arya Stark', 'activity': u'Development'},
{'duration': 0.9, 'project_id': 4, 'resource': u'Ned Stark', 'activity': u'Development'},
{'duration': 2.88, 'project_id': 7, 'resource': u'Robb Stark', 'activity': u'Development'},
{'duration': 0.22, 'project_id': 9, 'resource': u'Jon Snow', 'activity': u'Support'},
{'duration': 0.3, 'project_id': 9, 'resource': u'Jon Snow', 'activity': u'Support'},
{'duration': 2.15, 'project_id': 3, 'resource': u'Arya Stark', 'activity': u'Practise'},
{'duration': 3.35, 'project_id': 4, 'resource': u'Sansa Stark', 'activity': u'Development'},
{'duration': 2.17, 'project_id': 9, 'resource': u'Rickon Stark', 'activity': u'Development'},
{'duration': 1.03, 'project_id': 4, 'resource': u'Benjan Stark', 'activity': u'Design'},
{'duration': 1.77, 'project_id': 4, 'resource': u'Bran Stark', 'activity': u'Testing'},
{'duration': 1.17, 'project_id': 4, 'resource': u'Ned Stark', 'activity': u'Development'},
{'duration': 0.17, 'project_id': 9, 'resource': u'Jon Snow', 'activity': u'Support'},
{'duration': 1.77, 'project_id': 3, 'resource': u'catelyn stark', 'activity': u'Development'},
{'duration': 0.3, 'project_id': 9, 'resource': u'Jon Snow', 'activity': u'Support'},
{'duration': 0.45, 'project_id': 9, 'resource': u'Jon Snow', 'activity': u'Support'}]
df = pd.DataFrame(dict_data)
pvt = pd.pivot_table(df, values=['duration'],index=['project_id','resource'], columns=['activity'], aggfunc=np.sum,margins=True, fill_value=0)

So I am expecting output as in below pattern:

enter image description here

The problem is I can't able to append the subtotal row to the table

2 Answers2

2

Reshape by unstack for creating new last row subtotal with sum, but need filter out all All columns. Then stack, swaplevel and sort_index. Columns are sorted, so get All column to last position by subset:

pvt = pvt.unstack(0)
mask = pvt.columns.get_level_values('project_id') != 'All'
#print (mask)
pvt.loc['subtotal'] = pvt.loc[:, mask].sum()
pvt = pvt.stack().swaplevel(0,1).sort_index()
pvt = pvt[pvt.columns[1:].tolist() + pvt.columns[:1].tolist()]
print (pvt)
                         duration                                            
activity                   Design Development Practise Support Testing    All
project_id resource                                                          
3          Arya Stark        0.00        0.70     2.15    0.00    0.00   2.85
           catelyn stark     0.00        1.77     0.00    0.00    0.00   1.77
           subtotal          0.00        2.47     2.15    0.00    0.00   4.62
4          Benjan Stark      1.03        0.00     0.00    0.00    0.00   1.03
           Bran Stark        0.00        0.00     0.00    0.00    1.77   1.77
           Ned Stark         0.00        2.07     0.00    0.00    0.00   2.07
           Sansa Stark       0.00        3.35     0.00    0.00    0.00   3.35
           subtotal          1.03        5.42     0.00    0.00    1.77   8.22
7          Robb Stark        0.00        2.88     0.00    0.00    0.00   2.88
           subtotal          0.00        2.88     0.00    0.00    0.00   2.88
9          Jon Snow          0.00        0.00     0.00    1.44    0.00   1.44
           Rickon Stark      0.00        2.17     0.00    0.00    0.00   2.17
           subtotal          0.00        2.17     0.00    1.44    0.00   3.61
All                          1.03       12.94     2.15    1.44    1.77  19.33
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
0

If you are interested I have just created a little function to make it more easy as you might want to apply this code 'subtotal' on many table. It works for both table created via pivot_table() and groupby().

def get_subtotal(table, sub_total='subtotal', get_total=False, total='TOTAL'):
    """
    Parameters
    ----------
    table : dataframe, table with multi-index resulting from pd.pivot_table() or 
    df.groupby().
    sub_total : str, optional
        Name given to the subtotal. The default is '_Sous-total'.
    get_total : boolean, optional
        Precise if you want to add the final total (in case you used groupeby()). 
        The default is False.
    total : str, optional
        Name given to the total. The default is 'TOTAL'.

    Returns
    -------
    A table with the total and subtotal added.
    """
    index_name1 = table.index.names[0]
    index_name2 = table.index.names[1]
    
    pvt = table.unstack(0)
    mask = pvt.columns.get_level_values(index_name1) != 'All'
    #print (mask)
    pvt.loc[sub_total] = pvt.loc[:, mask].sum()
    
    pvt = pvt.stack().swaplevel(0,1).sort_index()
    pvt = pvt[pvt.columns[1:].tolist() + pvt.columns[:1].tolist()]
    
    if get_total:
        mask = pvt.index.get_level_values(index_name2) != sub_total
        pvt.loc[(total, '' ),: ] = pvt.loc[mask].sum()
    print (pvt)
    return(pvt)
Tomtomtom
  • 1
  • 2