Hey I am struggling with a transformation of a DataFrame:
The initial frame has a format like this:
df=pd.DataFrame({'A':['A1','A1','A1','A1','A1','A2','A2','A2','A2','A3','A3','A3','A3'],
'B':['B1','B1','B1','B1','B2','B2','B2','B3','B3','B3','B4','B4','B4'],
'C':['C1','C1','C1','C2','C2','C3','C3','C4','C4','C5','C5','C6','C6'],
'X':['a','b','c','a','c','a','b','b','c','a','c','a','c'],
'Y':[1,4,4,2,4,1,4,3,1,2,3,4,5]})
A B C X Y
A1 B1 C1 a 1
A1 B1 C1 b 4
A1 B1 C1 c 4
A1 B1 C2 a 2
A1 B2 C2 c 4
A2 B2 C3 a 1
A2 B2 C3 b 4
A2 B3 C4 b 3
A2 B3 C4 c 1
A3 B3 C5 a 2
A3 B4 C5 c 3
A3 B4 C6 a 4
A3 B4 C6 c 5
I have some columns in the beginning where I want to apply groupby and then transpose the last two columns:
First df.groupby(['A','B','C','X']).sum()
Y
A B C X
A1 B1 C1 a 1
b 4
c 4
C2 a 2
B2 C2 c 4
A2 B2 C3 a 1
b 4
B3 C4 b 3
c 1
A3 B3 C5 a 2
B4 C5 c 3
C6 a 4
c 5
and then transpose the X/Y columns and add them horizontally.
A B C a b c
A1 B1 C1 1.0 4.0 4.0
A1 B1 C2 2.0 NaN NaN
A1 B2 C2 NaN NaN 4.0
A2 B2 C3 1.0 4.0 NaN
A2 B3 C4 NaN 3.0 1.0
A3 B3 C5 2.0 NaN NaN
A3 B4 C5 NaN NaN 3.0
A3 B4 C6 4.0 NaN 5.0
Not all groupby rows have all values so they need to be filled with something like np.nan. The question is linked to this one here but it is more complicated and I couldn't figure it out.