Select column header names based on column value in pandas

Question

Hi, I have the following data frame:

        col1    col2    col3    col4    col5
row1    0       1       0       0       0
row2    0       0       0       0       1

I want to create a data frame like this:

row1    col2
row2    col5

Actually, I want to select the column names whose value is 1.

Welcome to SO @sovan, please post the code you have tried so far – Naga kiran, Oct 07 '18 at 13:25
Can you assume that there is exactly one non-zero element in each row? – fuglede, Oct 07 '18 at 13:29
The first column is of string type; the other columns are either 0 or 1. – Sovan, Oct 07 '18 at 13:37

fuglede · Answer 1 · 2018-10-07T13:44:43.560

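One approach is to use idxmax along the columns (axis 1), which returns, for each row, the label of the first column holding that row's maximum: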

df.idxmax(1)

With your given test data:

In [113]: df
Out[113]:
      col1  col2  col3  col4  col5
row1     0     1     0     0     0
row2     0     0     0     0     1

In [114]: df.idxmax(1)
Out[114]:
row1    col2
row2    col5
dtype: object
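
For anyone who wants to reproduce this outside an interactive session, here is a minimal self-contained sketch. The frame is rebuilt by hand from the data in the question, and the name `result` is only illustrative. It assumes each row contains exactly one 1, since `idxmax` reports only the first column that holds the row maximum.

```python
import pandas as pd

# Rebuild the example frame from the question, with row labels as the index
df = pd.DataFrame(
    [[0, 1, 0, 0, 0], [0, 0, 0, 0, 1]],
    index=["row1", "row2"],
    columns=["col1", "col2", "col3", "col4", "col5"],
)

# axis=1 scans across the columns of each row and returns
# the label of the first column holding the row maximum
result = df.idxmax(axis=1)
print(result)
```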

Based on what you mention in the comment below, if the column containing 'row1' and 'row2' is not already your index, you can call df.set_index first and then use idxmax as above:

In [120]: df
Out[120]:
  index  col1  col2  col3  col4  col5
0  row1     0     1     0     0     0
1  row2     0     0     0     0     1

In [121]: df.set_index('index').idxmax(1)
Out[121]:
index
row1    col2
row2    col5
dtype: object
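
The same steps as a standalone sketch, assuming the string labels sit in the first column (named 'index' here, as in the output above). The names `flags`, `first_hit` and `all_hits` are hypothetical. The last step is an addition and not part of the approach above: it lists every column equal to 1, which may be useful if a row could hold several ones or none at all, a case where `idxmax` alone would be misleading.

```python
import pandas as pd

# Frame where the row labels are an ordinary string column
df = pd.DataFrame({
    "index": ["row1", "row2"],
    "col1": [0, 0],
    "col2": [1, 0],
    "col3": [0, 0],
    "col4": [0, 0],
    "col5": [0, 1],
})

# Move the first (string) column into the index so only the 0/1 columns remain
flags = df.set_index(df.columns[0])

# Label of the first column holding the maximum in each row
first_hit = flags.idxmax(axis=1)
print(first_hit)

# Variant (an assumption about what may be needed): every column equal to 1, per row
all_hits = {
    label: list(flags.columns[row.to_numpy()])
    for label, row in flags.eq(1).iterrows()
}
print(all_hits)
```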

edited Oct 07 '18 at 13:44

answered Oct 07 '18 at 13:30

fuglede

The first column is of string type and the other columns are either 0 or 1. In this scenario it's failing. – Sovan Oct 07 '18 at 13:35
Then you can set that column to be your index using `df.set_index`, specifying the name of the column (i.e. what boils down to `df.set_index(df.columns[0])`). – fuglede Oct 07 '18 at 13:36
