1

I've a dataframe with 3 levels of multi index columns:

quarter           Q1                        Q2                        Totals
year              2021        2022           2021         2022                      
                 qty orders  qty orders    qty orders   qty orders   qty orders
month name                                       
January          40  2        5   1         1   2         0 0             46  5
February         20  8        2   3         4   6         0 0             26  17
March            2  10        7   4         3   3         0 0             12  17
Totals           62 20       14   8         8   11        0 0             84  39

After doing a groupy by levels (0,2), I've the following subtotals dataframe:

quarter           Q1           Q2          Totals                     
                 qty orders  qty orders    qty orders  
month name                                       
January          45  3        1   2         46   5     
February         22  10       4   6         26   16     
March            9  14        3   3         12   17   
Totals           76 28        8   11        84   39

I need to insert the second into the first, without upsetting the columns, levels or index so that I get the following dataframe:

quarter       Q1                                   Q2                        Totals
year        2021        2022      Subtotal    2021        2022     Subtotal                 
            qty orders qty orders qty orders qty orders qty orders qty orders qty orders
month name                                       
January     40  2       5   1     45   3       1  2       0  0       1  2     46  5
February    20  8       2   3     22   10      4  6       0  0       4  6     26  16
March       2  10       7   4     9    14      3  3       0  0       3  3     12  17
Totals      62 20      14   8     76   28      8  11      0  0       8  11    84 39

How do I do this?

Laurent
  • 12,287
  • 7
  • 21
  • 37
Judy T Raj
  • 1,755
  • 3
  • 27
  • 41

1 Answers1

1

With your initial dataframe (before groupby):

import pandas as pd


df = pd.DataFrame(
    [
        [40, 2, 5, 1, 1, 2, 0, 0],
        [20, 8, 2, 3, 4, 6, 0, 0],
        [2, 10, 7, 4, 3, 3, 0, 0],
        [62, 20, 14, 8, 8, 11, 0, 0],
    ],
    columns=pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022"), ("qty", "orders")]
    ),
    index=["January", "February", "March", "Totals"],
)

Here is one way to do it (using product from Python standard library's itertools module, otherwise a nested for-loop is also possible):

# Add new columns
for level1, level2 in product(["Q1", "Q2"], ["qty", "orders"]):
    df.loc[:, (level1, "subtotal", level2)] = (
        df.loc[:, (level1, "2021", level2)] + df.loc[:, (level1, "2022", level2)]
    )

# Sort columns
df = df.reindex(
    pd.MultiIndex.from_product(
        [("Q1", "Q2"), ("2021", "2022", "subtotal"), ("qty", "orders")]
    ),
    axis=1,
)

Then:

print(df)
# Output
           Q1                                      Q2                     \
         2021        2022        subtotal        2021        2022
          qty orders  qty orders      qty orders  qty orders  qty orders   
January    40      2    5      1       45      3    1      2    0      0   
February   20      8    2      3       22     11    4      6    0      0   
March       2     10    7      4        9     14    3      3    0      0   
Totals     62     20   14      8       76     28    8     11    0      0   


         subtotal
              qty orders  
January         1      2  
February        4      6  
March           3      3  
Totals          8     11  
Laurent
  • 12,287
  • 7
  • 21
  • 37
  • Is there a way to generalize this for any number of levels if the groupby is always by the first and last levels? – Judy T Raj Dec 13 '22 at 12:26
  • Probably, but I don't see any easy way to do that. This deserves a new, separate question. Cheers. – Laurent Dec 13 '22 at 20:19