I have a dataframe with the following general layout:

id,ind_1,ind_2,ind_3
1,0,1,0
1,1,0,0
2,0,1,0
2,0,0,1
3,0,0,1
3,1,0,0

I would like to add a column that holds, for each row, the name of the indicator column whose value is 1. The result should look like this:

id,ind_1,ind_2,ind_3,ind_all
1,0,1,0,ind_2
1,1,0,0,ind_1
2,0,1,0,ind_2
2,0,0,1,ind_3
3,0,0,1,ind_3
3,1,0,0,ind_1

Any tips welcome!

Pylander

1 Answer

You need `idxmax` along the columns, applied to a boolean mask of the indicator columns:

df['ind_all'] = (df.iloc[:, 1:] == 1).idxmax(1)


    id  ind_1   ind_2   ind_3   ind_all
0   1   0       1       0       ind_2
1   1   1       0       0       ind_1
2   2   0       1       0       ind_2
3   2   0       0       1       ind_3
4   3   0       0       1       ind_3
5   3   1       0       0       ind_1
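
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the same one-liner. The name `indicator_cols` is my own, and selecting the columns by name instead of `iloc[:, 1:]` is just an assumption that makes the example independent of column order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id":    [1, 1, 2, 2, 3, 3],
    "ind_1": [0, 1, 0, 0, 0, 1],
    "ind_2": [1, 0, 1, 0, 0, 0],
    "ind_3": [0, 0, 0, 1, 1, 0],
})

# Hypothetical helper name: the indicator columns to scan.
indicator_cols = ["ind_1", "ind_2", "ind_3"]

# Build a boolean mask (True where the indicator is 1), then take,
# per row, the label of the first column that holds True.
df["ind_all"] = (df[indicator_cols] == 1).idxmax(axis=1)

print(df)
```

Note that `idxmax` returns the first matching column, so this assumes at most one indicator is set per row, as in the sample data.
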
Vaishali
    What is the point of `== 1` if we use `idxmax`? I think `df.set_index('id').idxmax(1).values` is more readable, for instance. – Anton vBR Mar 19 '18 at 22:15
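
To make the comparison in the comment concrete, the sketch below runs both variants on the question's sample data. On strictly 0/1 indicators they should agree. The second part is my own addition, not something from the answer: it assumes you may have rows where no indicator is set, in which case `idxmax` still returns the first column label, so the result is masked with `any(axis=1)`. The names `via_mask`, `via_index`, `edge` and `labels` are hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id":    [1, 1, 2, 2, 3, 3],
    "ind_1": [0, 1, 0, 0, 0, 1],
    "ind_2": [1, 0, 1, 0, 0, 0],
    "ind_3": [0, 0, 0, 1, 1, 0],
})

# Variant from the answer: explicit boolean mask, then idxmax.
via_mask = (df.iloc[:, 1:] == 1).idxmax(axis=1).tolist()

# Variant from the comment: move id into the index, then idxmax.
via_index = df.set_index("id").idxmax(axis=1).tolist()

# Assumed edge case (not in the question): a row with no indicator set.
edge = pd.DataFrame({"id": [4], "ind_1": [0], "ind_2": [0], "ind_3": [0]})
mask = edge.iloc[:, 1:] == 1

# idxmax alone would report the first column here, so keep a label
# only for rows where at least one indicator is actually 1.
labels = mask.idxmax(axis=1).where(mask.any(axis=1))

print(via_mask == via_index)
print(labels)
```

The explicit `== 1` mainly matters if the columns can hold values other than 0 and 1; for clean binary indicators either form is a matter of taste.
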