Just tested on Pandas 0.17.0 and Python 2.7.5 and I can now understand why you asked me issues with reindex(axis=1) and the '*' before df1.columns.levels[1]
. It was indeed version problems from both Pandas and Python. I modified the code to run through the older versions mentioned above and also fixed one potential bug in case multiple common descriptive stats need to be post-calculated in the same Pivot table. Going forward, it would be easier to mention the versions of software (if they are older versions) in your future posts, so there will be less misunderstanding:
import pandas as pd
str = """card auth month order_number
Amex A 2017-11 1234
Visa A 2017-12 2345
Amex D 2017-12 3416
MC A 2017-12 3426
Visa A 2017-11 3436
Amex D 2017-12 3446
Visa A 2017-11 3466
Amex D 2017-12 3476
Visa D 2017-11 3486
"""
# create dataframe from the above sample data
df = pd.read_table(pd.io.common.StringIO(str), sep='\s+')
# create the pivot_table using the method OP supplied
df1 = df.pivot_table(index='card', columns=['month', 'auth'], values='order_number', aggfunc='count')
print(df1)
# month 2017-11 2017-12
# auth A D A D
# card
# Amex 1.0 NaN NaN 3.0
# MC NaN NaN 1.0 NaN
# Visa 2.0 1.0 1.0 NaN
# create an empty dataframe with the same index/column layout as df1
# except the level-1 in columns
idx = pd.MultiIndex.from_product([df1.columns.levels[0], ['total', 'avg', 'std', 'pct']], names=df1.columns.names)
df2 = pd.DataFrame(columns=idx, index=df1.index).sort_index(axis=1)
print(df2)
# month 2017-11 2017-12
# auth avg pct std total avg pct std total
# card
# Amex NaN NaN NaN NaN NaN NaN NaN NaN
# MC NaN NaN NaN NaN NaN NaN NaN NaN
# Visa NaN NaN NaN NaN NaN NaN NaN NaN
# Calculate the common stats:
df2.loc[:,(slice(None),'total')] = df1.groupby(level=0, axis=1).sum().values
df2.loc[:,(slice(None),'avg')] = df1.groupby(level=0, axis=1).mean().values
df2.loc[:,(slice(None),'std')] = df1.groupby(level=0, axis=1).std().values
# join df2 with df1 and assign the result to df3 (can also overwrite df1):
df3 = df1.join(df2).sort_index(axis=1)
# calculate `pct` which needs both a calculated field and an original field
# auth-rate = A / total
df3.loc[:,(slice(None),'pct')] = df3.groupby(level=0, axis=1)\
.apply(lambda x: x.loc[:,(slice(None),'A')].values/x.loc[:,(slice(None),'total')].values) \
.values
print(df3)
# month 2017-11 2017-12
# auth A D avg pct std total A D avg pct std total
# card
# Amex 1 NaN 1.0 1.000000 NaN 1 NaN 3 3 NaN NaN 3
# MC NaN NaN NaN NaN NaN NaN 1 NaN 1 1 NaN 1
# Visa 2 1 1.5 0.666667 0.707107 3 1 NaN 1 1 NaN 1
# rounding if needed:
df3.loc[:,(slice(None),'pct')] = df3.loc[:,(slice(None),'pct')].round(decimals=2)
If you want to sort the level-1 columns in a specific order, you can do the reindex().
# create a ordered list of level-1 on columns
column_level_1 = list(df1.columns.levels[1]) + ['total', 'avg', 'std', 'pct']
# create MultiIndex for columns and reindex_axis accordingly
midx = pd.MultiIndex.from_product([df1.columns.levels[0], column_level_1], names=df1.columns.names)
df3 = df3.reindex_axis(midx, axis=1)
print(df3)
# month 2017-11 2017-12
# auth A D total avg std pct A D total avg std pct
# card
# Amex 1 NaN 1 1.0 NaN 1.000000 NaN 3 3 3 NaN NaN
# MC NaN NaN NaN NaN NaN NaN 1 NaN 1 1 NaN 1
# Visa 2 1 3 1.5 0.707107 0.666667 1 NaN 1 1 NaN 1