My dataframe looks like this after converting the categorical columns to numerical ones with get_dummies():

score1  score2  country_CN  country_AU  category_leader  category_
0.89    0.45    0           1           0                1
0.55    0.54    1           0           1                0

As you can see, the columns converted from categorical to numerical are country_CN, country_AU, category_leader and category_.

I want to bring it back to its original form, something like this:

score1  score2  country  category_leader
0.89    0.45    AU
0.55    0.54    CN       leader

I have tried using the suggestion listed here: Reverse a get_dummies encoding in pandas

But no luck as of yet.

Any help or clue?

Jazz

1 Answer

You can first move the non-dummy columns into the index with DataFrame.set_index, so that only the dummy columns are passed to the function:

#https://stackoverflow.com/a/62085741/2901002
df = undummify(df.set_index(['score1','score2'])).reset_index()
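The line above relies on an undummify function from the linked answer, which is not reproduced here. As a minimal sketch of what such a function could look like (this is an assumed implementation written for illustration, not necessarily identical to the linked one), each group of columns sharing a prefix is collapsed with idxmax:

```python
import pandas as pd

def undummify(df, prefix_sep="_"):
    # Assumed implementation for illustration; the linked answer may differ.
    groups = {}        # prefix -> list of its dummy columns
    passthrough = []   # columns without the separator are kept unchanged
    for col in df.columns:
        if prefix_sep in col:
            groups.setdefault(col.split(prefix_sep, 1)[0], []).append(col)
        else:
            passthrough.append(col)

    pieces = [df[col] for col in passthrough]
    for prefix, cols in groups.items():
        # idxmax(axis=1) gives, per row, the name of the column holding the 1
        labels = df[cols].idxmax(axis=1)
        # keep only the part after the first separator, e.g. 'country_AU' -> 'AU'
        pieces.append(labels.str.split(prefix_sep, n=1).str[1].rename(prefix))
    return pd.concat(pieces, axis=1)

# Sample data matching the question
df = pd.DataFrame({
    "score1": [0.89, 0.55],
    "score2": [0.45, 0.54],
    "country_CN": [0, 1],
    "country_AU": [1, 0],
    "category_leader": [0, 1],
    "category_": [1, 0],
})

result = undummify(df.set_index(["score1", "score2"])).reset_index()
print(result)
```

Note that idxmax picks the first column when a row is all zeros (for example after get_dummies(drop_first=True)), so this sketch assumes every row has exactly one 1 per prefix.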

Or use an alternative solution with DataFrame.melt: filter the rows with boolean indexing, split the column names with Series.str.split, and finally pivot with DataFrame.pivot:

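# reshape to long format: one row per (score1, score2, dummy column)
df1 = df.melt(['score1','score2'])
# keep only the dummy columns that are set; copy to avoid SettingWithCopyWarning
df1 = df1[df1['value'].eq(1)].copy()
# split e.g. 'country_AU' into prefix 'country' and value 'AU' (first '_' only)
df1[['a','b']] = df1.pop('variable').str.split('_', n=1, expand=True)
# one column per prefix again
df1 = df1.pivot(index=['score1','score2'], columns='a', values='b').reset_index()
print (df1)

a  score1  score2 category country
0    0.55    0.54   leader      CN
1    0.89    0.45               AU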
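
To check the melt/pivot approach end to end, here is a rough sketch that wraps the same steps in a helper and runs a round trip through get_dummies. The helper name reverse_dummies and the sample "original" frame are assumptions made up for this example, not part of the question:

```python
import pandas as pd

def reverse_dummies(df, id_cols, sep="_"):
    # Hypothetical helper: the melt / filter / split / pivot steps in one place.
    long = df.melt(id_cols)
    long = long[long["value"].eq(1)].copy()
    long[["a", "b"]] = long.pop("variable").str.split(sep, n=1, expand=True)
    wide = long.pivot(index=id_cols, columns="a", values="b").reset_index()
    wide.columns.name = None  # drop the leftover 'a' label on the columns
    return wide

# Assumed pre-encoding data, reconstructed from the question's tables
original = pd.DataFrame({
    "score1": [0.89, 0.55],
    "score2": [0.45, 0.54],
    "country": ["AU", "CN"],
    "category": ["", "leader"],
})

dummies = pd.get_dummies(original, columns=["country", "category"])
restored = reverse_dummies(dummies, ["score1", "score2"])
print(restored)
```

Two caveats: pivot sorts the rows by the index columns, so the original row order is not preserved, and it raises a ValueError if the same (score1, score2) pair occurs more than once. In that case, adding a unique row id (for example via reset_index) to the id columns before melting should work.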
jezrael