first of all, this post is very useful: How to pivot a dataframe
Right now I have the following target:
df = pd.DataFrame({"A": ["foo", "foo", "foo", "foo", "foo",
... "bar", "bar", "bar", "bar"],
... "B": ["one", "one", "one", "two", "two",
... "one", "one", "two", "two"],
... "C": ["small", "large", "large", "small",
... "small", "large", "small", "small",
... "large"],
... "D": [1, 2, 2, 3, 3, 4, 5, 6, 7],
... "E": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
table = pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'],
... aggfunc={'D': np.mean,
... 'E': ['count', max, np.mean]})
flattened = pd.DataFrame(table.to_records())
with the outcome and target:
A C ('D', 'mean') ('E', 'count') ('E', 'max') ('E', 'mean')
0 bar large 5.500000 2.0 9.0 7.500000
1 bar small 5.500000 2.0 9.0 8.500000
2 foo large 2.000000 2.0 5.0 4.500000
3 foo small 2.333333 3.0 6.0 4.333333
Is there any aquivalent command with groupby? Something like: df.groupby(['row', 'col'])['val0'].agg(['size', 'mean']).unstack(fill_value=0)
Or this already the most efficient way to do it?