0

I have a DataFrame with multi-level columns, and I am not able to find a way to groupby the columns. Is there something for addressing the columns or should I go for the route of joining the names like in this question?

SOLUTION: I was addressing the columns as ['A','X'] instead of ('A','X')

Community
  • 1
  • 1
lib
  • 2,918
  • 3
  • 27
  • 53

2 Answers2

1
df.groupby([('depth1_column1', 'depth2_column1'), ('depth1_column1', 'depth2_column2')]).aggregate(sum)
su79eu7k
  • 7,031
  • 3
  • 34
  • 40
0

You can still use .groupby on columns. Below is a simple example.

import pandas as pd
import numpy as np

# data
# ==========================================
np.random.seed(0)
multi_col = pd.MultiIndex.from_product([['A', 'B'], ['X', 'Y']], names=['ab', 'xy'])
df = pd.DataFrame(np.random.randint(1,5, (10,4)), columns=multi_col)
df

ab  A     B   
xy  X  Y  X  Y
0   1  4  2  1
1   4  4  4  4
2   2  4  2  3
3   1  4  3  1
4   1  1  3  2
5   3  4  4  3
6   1  2  2  2
7   2  1  2  1
8   4  1  4  2
9   3  4  4  1

# groupby one column
# ===================================
for g_name, g in df.groupby([('A', 'X')]):
    print(g_name)
    print(g)


1
ab  A     B   
xy  X  Y  X  Y
0   1  4  2  1
3   1  4  3  1
4   1  1  3  2
6   1  2  2  2
2
ab  A     B   
xy  X  Y  X  Y
2   2  4  2  3
7   2  1  2  1
3
ab  A     B   
xy  X  Y  X  Y
5   3  4  4  3
9   3  4  4  1
4
ab  A     B   
xy  X  Y  X  Y
1   4  4  4  4
8   4  1  4  2    


# groupby two columns
# ===================================
for g_name, g in df.groupby([('A','X'), ('A','Y')]):
    print(g_name)
    print(g)

(1, 1)
ab  A     B   
xy  X  Y  X  Y
4   1  1  3  2
(1, 2)
ab  A     B   
xy  X  Y  X  Y
6   1  2  2  2
(1, 4)
ab  A     B   
xy  X  Y  X  Y
0   1  4  2  1
3   1  4  3  1
(2, 1)
ab  A     B   
xy  X  Y  X  Y
7   2  1  2  1
(2, 4)
ab  A     B   
xy  X  Y  X  Y
2   2  4  2  3
(3, 4)
ab  A     B   
xy  X  Y  X  Y
5   3  4  4  3
9   3  4  4  1
(4, 1)
ab  A     B   
xy  X  Y  X  Y
8   4  1  4  2
(4, 4)
ab  A     B   
xy  X  Y  X  Y
1   4  4  4  4
Jianxun Li
  • 24,004
  • 10
  • 58
  • 76