I would like to reverse a get_dummies encoding when several columns were encoded at once ("A" and "B" in this example):
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'], 'C': [1, 2, 3]})
   A  B  C
0  a  b  1
1  b  a  2
2  a  c  3
df = pd.get_dummies(df)
   C  A_a  A_b  B_a  B_b  B_c
0  1    1    0    0    1    0
1  2    0    1    1    0    0
2  3    1    0    0    0    1
When I invert the dummy-encoded df, the result should keep the two categories "A" and "B" in separate columns rather than stacking them into a single column like this:
   C Col2
0  1  B_b
1  2  A_b
2  3  B_c
I've tried:
df[df == 1].stack().reset_index().drop(columns=0)
   level_0 level_1
0        0       C
1        0     A_a
2        0     B_b
3        1     A_b
4        1     B_a
5        2     A_a
6        2     B_c
df.idxmax(axis=1)
0 C
1 C
2 C
dtype: object
v = np.argwhere(df.drop(columns='C').values).T
t = pd.DataFrame({'C': df.loc[v[0], 'C'], 'Col2': df.columns[1:][v[1]]})
t
   C Col2
0  1  B_b
1  2  A_b
2  3  B_c
df2 = df.select_dtypes(include=['object'])
df2[df2.columns].apply(lambda x: x.astype('category'))
The reverse should give back the original DataFrame:
   A  B  C
0  a  b  1
1  b  a  2
2  a  c  3
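To make the goal concrete, the round trip I am after could be sketched roughly as below. This is only an illustration under assumptions: undummify is a hypothetical helper name (not a pandas function), the prefix separator is assumed to be the default "_", neither the prefixes nor the untouched column names (such as "C") are assumed to contain that separator, and every row is assumed to have exactly one 1 per prefix group.

```python
import pandas as pd


def undummify(dummies, prefix_sep="_"):
    """Hypothetical helper: collapse dummy columns back to one column per prefix."""
    groups = {}       # prefix -> list of dummy column names sharing that prefix
    passthrough = []  # columns that were never encoded (no separator in the name)
    for col in dummies.columns:
        if prefix_sep in col:
            prefix = col.split(prefix_sep, 1)[0]
            groups.setdefault(prefix, []).append(col)
        else:
            passthrough.append(col)

    restored = {}
    for prefix, cols in groups.items():
        # Within one prefix group, idxmax(axis=1) gives the label of the
        # column holding the 1 in each row; astype(int) handles bool dummies.
        labels = dummies[cols].astype(int).idxmax(axis=1)
        # Strip "<prefix><sep>" so that only the category value remains.
        restored[prefix] = labels.str[len(prefix) + len(prefix_sep):]
    for col in passthrough:
        restored[col] = dummies[col]
    return pd.DataFrame(restored, index=dummies.index)


original = pd.DataFrame({'A': ['a', 'b', 'a'], 'B': ['b', 'a', 'c'], 'C': [1, 2, 3]})
encoded = pd.get_dummies(original)
decoded = undummify(encoded)
print(decoded)
```

The point of grouping by prefix first is that idxmax is then applied to each group of dummy columns separately, so "A" and "B" stay in their own columns instead of being stacked together.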
Thank you for your help in advance!