I have this dataframe, data.
import pandas as pd

data = pd.DataFrame({'group': ['A', 'A', 'B', 'C', 'C', 'B'],
                     'value': [0.2, 0.21, 0.54, 0.02, 0.001, 0.19]})
I want to build three new features, one per group, each holding value where the row belongs to that group and 0 elsewhere. Below is my target output.
pd.DataFrame({'group': ['A', 'A', 'B', 'C', 'C', 'B'],
              'value': [0.2, 0.21, 0.54, 0.02, 0.001, 0.19],
              'group_A': [0.2, 0.21, 0, 0, 0, 0],
              'group_B': [0, 0, 0.54, 0, 0, 0.19],
              'group_C': [0, 0, 0, 0.02, 0.001, 0]})
What is the most efficient way to perform such a task? The loop below solves the problem, but is there a vectorized way to do it on my very large real-world data set?
for g in data.group.unique():
    # Keep the value where the row belongs to group g, otherwise 0.
    tmp = [i if j == g else 0 for i, j in zip(data.value, data.group)]
    data['group_{}'.format(g)] = tmp
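For reference, here is one vectorized direction I could imagine, as a rough sketch only. It assumes pd.get_dummies is acceptable here and that the one-hot frame fits in memory; the names dummies, features and result are just illustrative, and I have not benchmarked it against the loop.

```python
import pandas as pd

data = pd.DataFrame({'group': ['A', 'A', 'B', 'C', 'C', 'B'],
                     'value': [0.2, 0.21, 0.54, 0.02, 0.001, 0.19]})

# One-hot encode the group labels as floats; the prefix yields
# columns named group_A, group_B, group_C.
dummies = pd.get_dummies(data['group'], prefix='group', dtype=float)

# Multiply every indicator column by the value column, row by row
# (axis=0 aligns the Series on the row index).
features = dummies.mul(data['value'], axis=0)

# Attach the new feature columns to the original frame.
result = pd.concat([data, features], axis=1)
```

The idea is that each indicator is 1.0 only in rows of its own group, so the product keeps value there and gives 0 everywhere else, without a Python-level loop over rows.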