How to aggregate subcategories in pandas.dataframe.pivot_table

Question

I created a pivot table that is meant to count the total number of rejections for Types A-K and their subtypes (A1-A12)

piv_table2 = df.loc[(Filter1) & (Filter2)].pivot_table(
    values=["Level3"], 
    index=["Level1", "Level2"], 
    columns=None,
    aggfunc="count",
    margins=True,
    margins_name="Total"
)

This is what a sample of my input df looks like:

WeekNumber, Level1, Level2, Level3
22, H, H1, Obs
22, H, H2, Exc
23, B, B10, Obs
23, A, A6, Obs

The week number columns are the week number, Level1 is the high level category, with Level2 being subcategories, & Level3 being "Obs" or "Exc"

piv_table2 looks like this:

# piv_table2 Output
                Level3
Level1  Level2  
A   
        A1-     175
        A10-    2
        A11-    10
        A12-    30
...

K   
        K1 -    12
        K11 -   12
        K11 -   13
Total           5170

Essentially, I'd like to count the sub categories (A - K) along with the complete total of everything from A-K. Ideally, the output for piv_table2 would give me the total for all As, all Bs, all Cs, etc. AND then the sum of all those counts. The desired output would look something like this:

# Desired Output with Level1 Totals & Complete Totals
                Level3
Level1  Level2  
A -             617 
        A1-     175
        A10-    2
        A11-    10
        A12-    30
...

K -             12  
        K1 -    10
        K11 -   1
        K11 -   1
Total           5170

The totals for A & K, along with the totals for EVERYTHING from A-K are shown in the desired output.

I tried using pivot_table2.merge(pivot_table) & vice versa with no success.

Welcome to Stack Overflow! Check out the [tour]. Please provide a [mre], meaning some example input data and the exact code you tried that didn't work, as well as desired output just to be clear. For specifics, see [How to make good reproducible pandas examples](/q/20109391/4518341). BTW, if you want more tips, check out [ask]. — wjandrea, Jul 19 '23 at 15:01
can you run df.to_dict() and post the results into your question! Thanks! :-) — Mark, Jul 19 '23 at 15:48
@Mark Based on the "Total", OP's df has 5170 rows -- too many to post. That's why I'm asking for an MRE myself. — wjandrea, Jul 19 '23 at 15:52
Please trim your code to make it easier to find your problem. Follow these guidelines to create a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). — Ahmad, Jul 19 '23 at 16:46

score 0 · Answer 1 · answered Jul 19 '23 at 19:06

If I understand you correctly you want to add Total to Level2 index and then final "grand-total":

x = df.pivot_table(index=['Level1', 'Level2'], values='Level3', aggfunc='count')

# add "Total" to level2
total = x.groupby(level=0)['Level3'].sum().to_frame()
total.index = pd.MultiIndex.from_product([total.index, ['Total']])

x = pd.concat([x, total]).sort_index()

# add all Total
all_total = pd.DataFrame({'Level3': x.xs('Total', level=1).sum().squeeze()}, index=pd.MultiIndex.from_tuples([('Total', '')]))

x = pd.concat([x, all_total]).sort_index()

print(x)

Prints:

               Level3
Level1 Level2        
A      A6           4
       A7           1
       A8           1
       Total        6
B      B10          2
       B11          1
       Total        3
H      H1           1
       H2           1
       H3           3
       Total        5
Total              14

Initial df:

    WeekNumber Level1 Level2 Level3
0           22      H     H1    Obs
1           22      H     H2    Exc
2           22      H     H3    Exc
3           22      H     H3    Exc
4           22      H     H3    Exc
5           23      B    B10    Obs
6           23      B    B10    Obs
7           23      B    B11    Obs
8           23      A     A6    Obs
9           23      A     A6    Obs
10          23      A     A6    Obs
11          23      A     A6    Obs
12          23      A     A7    Obs
13          23      A     A8    Obs

1

Worked great, thanks – Adrian Abella Jul 21 '23 at 16:05

How to aggregate subcategories in pandas.dataframe.pivot_table

1 Answers1