I have a pandas dataframe like where the first four columns form a multiindex:
import pandas as pd
data = [[1, 'A', 1, 0, 10],
[1, 'A', 0, 1, 10],
[1, 'A', 1, 1, 10],
[1, 'A', 0, 0, 10],
[1, 'B', 1, 0, 10],
[1, 'B', 0, 1, 10],
[1, 'B', 1, 1, 10],
[1, 'B', 0, 0, 10]]
cols = ['user_id','type','flag1','flag2','cnt']
df = pd.DataFrame(data,columns = cols)
df = df.set_index(['user_id','type','flag1','flag2'])
print df
user_id type flag1 flag2 cnt
________________________________________
1 A 1 0 10
1 A 0 1 10
1 A 1 1 10
1 A 0 0 10
1 B 1 0 10
1 B 0 1 10
1 B 1 1 10
1 B 0 0 10
I'd like to iterate over the index values to get the grouped total count for each unique index values like so:
user_id type flag1 flag2 cnt
________________________________________
1 ALL ALL ALL 80
1 ALL ALL 0 40
1 ALL ALL 1 40
1 ALL 1 ALL 40
1 ALL 0 ALL 40
1 A ALL ALL 40
1 B ALL ALL 40
1 A ALL 0 20
1 A ALL 1 20
1 B ALL 0 20
1 B ALL 1 20
1 A 1 ALL 20
1 A 0 ALL 20
1 B 1 ALL 20
1 B 0 ALL 20
1 A 1 0 10
1 A 0 1 10
1 A 1 1 10
1 A 0 0 10
1 B 1 0 10
1 B 0 1 10
1 B 1 1 10
1 B 0 0 10
I'm able to generate each group easily using query and groupby, but ideally I'd like to be able to iterate over any number of index columns to get the sum of the cnt column.