2

I am creating a pivot table with Pandas but got stuck at the subtotals for rows within different columns (under the same Values) for a while, I've done some research on stackoverflow such as Pandas Pivot tables row subtotals and this Subtotals for Pandas pivot table index and column (actually this one is close to but some what also different from my case) but didn't find right hint for my case, so please for this topic, thanks.

Here I will provide a simplified DataFrame (original one is too big to show here, so their values really do not matter, the format is what I am looking for) of my case and the code I wrote to get my pivot table.

import pandas as pd
import numpy as np
df = pd.DataFrame(
{
    'Co':['NN']*6,
    'Op':['A','B']*3,
    'Stk':[1.1,1.2,1.3]*2,
    'Tm':['07-01-2018','08-01-2018','09-01-2018']*2,
    'Qt':[100,200,300]*2
}
)
df

The df output looks like:

Co  Op  Qt  Stk Tm
0   NN  A   100 1.1 07-01-2018
1   NN  B   200 1.2 08-01-2018
2   NN  A   300 1.3 09-01-2018
3   NN  B   100 1.1 07-01-2018
4   NN  A   200 1.2 08-01-2018
5   NN  B   300 1.3 09-01-2018

Then I convert above df to my pivot table by:

df['Qt'] = df['Qt'].apply(pd.to_numeric)
df['Stk']=df['Stk'].apply(pd.to_numeric)
df['Co'] = df['Co'].astype(str)
tb=pd.pivot_table(df,index=["Tm"],columns=["Co","Op","Stk"],aggfunc=np.sum,values=['Qt'], fill_value=0, margins=True, margins_name='All')
tb

The generated pivot table looks like:

            Qt
Co          NN                              All
Op          A              B    
Stk         1.1  1.2  1.3  1.1  1.2  1.3    
        Tm                          
07-01-2018  100  0    0    100  0    0      200
08-01-2018  0    200  0    0    200  0      400
09-01-2018  0    0    300  0    0    300    600
       All  100  200  300  100  200  300    1200

The format which I really expect is:

            Qt
Co          NN                                                All
Op          A              ATotal   B               BTotal
Stk         1.1  1.2  1.3           1.1  1.2  1.3   
        Tm                          
07-01-2018  100  0    0    100      100  0    0     100       200
08-01-2018  0    200  0    200      0    200  0     200       400
09-01-2018  0    0    300  300      0    0    300   300       600
       All  100  200  300  600      100  200  300   600       1200

I was trying to create this exact same format for a while and still can't get the same one (I tried something like creating two separate A and B pivot tables and union them together, but it will mess with the All margins). So help is really needed here. p.s. I am still new to stackoverflow community, so please pardon if my question is missing some aspects, thank you.

Gin
  • 129
  • 2
  • 12

1 Answers1

2

pivot_table doesn't support it, but you can compute it yourself and concatenate it later:

tb.groupby(level='Op', axis=1).sum().add_suffix('Total')

Op          Total  ATotal  BTotal
Tm                               
07-01-2018    200     100     100
08-01-2018    400     200     200
09-01-2018    600     300     300
All          1200     600     600
cs95
  • 379,657
  • 97
  • 704
  • 746
  • Thanks for the answer, I actually got sth similar as yours by concatenate two pivot tables (two pivot table with different columns).... So I'm exploring if there's any option to incorporate them inside the pivot table with the exactly same format (I am also new to Pandas though)... – Gin Jun 06 '18 at 23:07
  • not quite yet.. as I want to keep all the other pivot table format the same as it is while just adding two subtotals in it. – Gin Jun 06 '18 at 23:12
  • @Jay Can't have the cake and eat it too my friend. – cs95 Jun 06 '18 at 23:12
  • in Excel I know we can create a sub pivot table inside a major pivot table which is what I would expect, so you wouldn't believe the Pandas can do something similar? Thanks. – Gin Jun 06 '18 at 23:16
  • @Jay if you're asking for level-wise margins, I've already told you pivot_table doesn't support this functionality yet. Maybe they've planned it for a future release, or maybe they haven't planned it at all. Best thing to do would be to open a feature request on the github repo and make your case. Until they implement such a feature, your only option is the workarounds. What workaround you use is upto you. I've given you one. You may prefer another. – cs95 Jun 06 '18 at 23:18
  • Thank you @coldspeed, I would go to github and try to open a request there. – Gin Jun 06 '18 at 23:23
  • Forgot to ask...@coldspeed do you know if there's a way to add two calculated columns inside the originally created pivot table and to keep the original format the same? I had some attempt on this, but didn't get the solution. – Gin Jun 06 '18 at 23:31
  • @Jay: Yeah, take this result for example: `res = tb.groupby....; df = pd.concat([tb, res], axis=1)` You can play around with the ordering. – cs95 Jun 06 '18 at 23:32