Problem
"How to pd.pivot_table with % (percentage) column where sum/ total is provided by df.pivot_table(margins=True)
"
Context
We have a sample pivot:
import pandas as pd
import numpy as np
df = pd.DataFrame([["row1",50, 200],["row2",100, 300]], columns=list('ABC'))
print(df)
print("\n\n")
pivot = df.pivot_table(
index=["A"],
columns=[],
values=["B", "C"],
aggfunc={
"B": np.sum,
"C": np.sum,
},
margins=True,
margins_name = "Total",
fill_value=0
)
print(pivot)
Like this:
B C
A
row1 50 200
row2 100 300
Total 150 500
Desired Output
B C D E
A
row1 50 200 250 38.46%
row2 100 300 400 61.54%
Total 150 500 650 100%
In words, we essentially wish to add column E (pct of row & column total)
to the output of pivot_table which is a percentage of the column total.
Note, to make the example a bit more readable we've added column 'D'
which should not be part of the actual output.
Beyond that, the output format must stay like this as eventually we will output it to an excel sheet for business usage.
Tried so far
Similar questions were asked:
Add percent of total column to Pandas pivot_table
- Only does it with a groupby but as we need the columns to stay intact when we print() so that didn't really work for me
- Only does it with a groupby but as we need the columns to stay intact when we print() so that didn't really work for me
Pandas percentage of total with groupby
- Didn't work for me as we need the output format to stay intact which I haven't figured out so far
- Didn't work for me as we need the output format to stay intact which I haven't figured out so far
Furthermore, I was hoping that pandas has maybe found a neat way around this with the latest version so we can do it using df.pivot_table
. They usually add some handy improvements to their iterations. :)
Specifications
Python: 3.5.2
Pandas: 0.18.1
Numpy: 1.11.1