0

I am trying to add a sum to my multiindex dataframe by each grouping

                          Count
state    car   status
texas    civic New           11
               undamaged     11
               damaged       10
               totalled       5
virginia civic New           10
               undamaged     20
               damaged       10
               totalled       5

I want it to look like:

                          Count
state    car   status
texas    civic New           11
               damaged       10
               totalled       5
               undamaged     11
               total         37
virginia civic New           10
               damaged       10
               totalled       5
               undamaged     20
               total         45

I have tried

s = test.groupby(level=[0,1]).sum()
s.index = pd.MultiIndex.from_product([s.index, ['Total']])

df_out = df_full.append(s).sort_index()

but it throws

NotImplementedError: isna is not defined for MultiIndex

Ynjxsjmh
  • 28,441
  • 6
  • 34
  • 52
Frank R.
  • 15
  • 4

2 Answers2

0

You problem is that pd.MultiIndex.from_product doesn't support product between multindex and list, instead you can use pd.MultiIndex.from_frame

s = df.groupby(level=[0,1]).sum()
s.index = pd.MultiIndex.from_frame(s.index.to_frame().assign(status='total'))
out = df.append(s).sort_index()
print(out)

                          Count
state    car   status
texas    civic New           11
               damaged       10
               total         37
               totalled       5
               undamaged     11
virginia civic New           10
               damaged       10
               total         45
               totalled       5
               undamaged     20

However, .sort_index() will change the index order, you can try following instead

df_ = df['Count'].unstack()
df_['total'] = df_.sum(axis=1)
df_ = df_.stack().to_frame('Count')
# or in one line
df_ = (df['Count'].unstack()
       .pipe(lambda d: d.assign(total=d.sum(axis=1)))
       .stack().to_frame('Count'))
print(df_)

                          Count
state    car   status
texas    civic New           11
               damaged       10
               totalled       5
               undamaged     11
               total         37
virginia civic New           10
               damaged       10
               totalled       5
               undamaged     20
               total         45
Ynjxsjmh
  • 28,441
  • 6
  • 34
  • 52
0

An easy way I’ve implemented this in my workflow is to use the Sidetables package. Link

You can use it like: test.groupby(level=[0,1]).sum().stb.subtotal(sub_level=2) will accomplish what you’re looking for.

keg5038
  • 341
  • 3
  • 13