I have a pandas DataFrame and want to reverse the binary encoding (i.e. get_dummies()) of three columns. The encoding is read left to right, so the leftmost column is the least significant bit:

    a   b   c
0   0   1   1
1   0   0   1
2   1   1   1
3   1   0   0

would result in a new category column C taking values 0-7:

    C
0   6
1   4
2   7
3   1

I am not sure why this line is giving me a syntax error, near axis=1:

df['C'] = df.apply(lambda x: (x['a']==1 ? 1:0)+(x['b']==1 ? 2:0)+(x['c']==1 ? 4:0), axis=1)
dr_rk

2 Answers

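If performance matters, use NumPy: convert the DataFrame to a NumPy array, then take its dot product with the powers of two produced by a bitwise shift:

import numpy as np

# assumes df holds only the 0/1 columns a, b, c
a = df.values
#pandas 0.24+
#a = df.to_numpy()
# 1 << np.arange(3) gives the weights [1, 2, 4]
df['C'] = a.dot(1 << np.arange(a.shape[-1]))
print (df)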
   a  b  c  C
0  0  1  1  6
1  0  0  1  4
2  1  1  1  7
3  1  0  0  1
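
For readers who want to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is copied from the question; the names bits, weights and codes are only illustrative, and selecting the three columns explicitly is an assumption made so the snippet still works if the frame has other columns.

```python
import numpy as np
import pandas as pd

# Sample data from the question: columns a, b, c hold 0/1 flags.
df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [1, 0, 1, 0], "c": [1, 1, 1, 0]})

# Pick the encoded columns explicitly so extra columns cannot leak in.
bits = df[["a", "b", "c"]].to_numpy()

# 1 << [0, 1, 2] yields the bit weights [1, 2, 4], leftmost column first.
weights = 1 << np.arange(bits.shape[1])

# Each row's dot product with the weights is a*1 + b*2 + c*4.
codes = bits.dot(weights)
df["C"] = codes
print(df)
```

Because the weights are built from the number of columns, the same lines should extend to more than three encoded columns without changes.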
jezrael

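Your approach is right; only the syntax needs changing. Python has no C-style ternary operator (cond ? x : y); the equivalent conditional expression is written x if cond else y.

I have modified your code: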

>>> df['C'] = df.apply(lambda x: (1 if x['a']==1 else 0)+(2 if x['b']==1 else 0)+(4 if x['c']==1 else 0), axis=1)
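
As a quick check, the sketch below runs the corrected line on the sample data from the question and compares it with plain column arithmetic. The column-arithmetic variant is not part of the original answer; it is an alternative that assumes the three columns contain only 0 and 1, and the names row_wise and column_wise are illustrative.

```python
import pandas as pd

# Sample data from the question: columns a, b, c hold 0/1 flags.
df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [1, 0, 1, 0], "c": [1, 1, 1, 0]})

# Row-wise version using Python's conditional expression
# (value_if_true if condition else value_if_false).
row_wise = df.apply(
    lambda x: (1 if x["a"] == 1 else 0)
    + (2 if x["b"] == 1 else 0)
    + (4 if x["c"] == 1 else 0),
    axis=1,
)

# Alternative (assumes the columns contain only 0 and 1):
# weight each column directly, with no per-row Python function.
column_wise = df["a"] + 2 * df["b"] + 4 * df["c"]

df["C"] = row_wise
print(df)
```

Both versions should give the same column here; the arithmetic form avoids calling a Python function once per row.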
shaik moeed