Hi, I have the following data frame:

      col1  col2  col3  col4  col5
row1     0     1     0     0     0
row2     0     0     0     0     1

I want to create a data frame like this:

row1    col2
row2    col5

Actually, for each row I want to select the name of the column whose value is 1.

fuglede
Sovan

1 Answer

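One approach could be idxmax along axis 1, which returns, for each row, the label of the column holding that row's maximum (here, the 1):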

df.idxmax(1)

With your given test data:

In [113]: df
Out[113]:
      col1  col2  col3  col4  col5
row1     0     1     0     0     0
row2     0     0     0     0     1

In [114]: df.idxmax(1)
Out[114]:
row1    col2
row2    col5
dtype: object
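
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the question's data, and the names `hot`, `has_one` and `guarded` are mine, not from the question. One caveat: for a row that contains no 1 at all, idxmax returns the first column label, so the last lines mask such rows, assuming you would rather get NaN there.

```python
import pandas as pd

# Rebuild the example frame from the question (row labels as the index).
df = pd.DataFrame(
    [[0, 1, 0, 0, 0],
     [0, 0, 0, 0, 1]],
    index=["row1", "row2"],
    columns=["col1", "col2", "col3", "col4", "col5"],
)

# For each row, the label of the column holding the row maximum (the 1).
hot = df.idxmax(axis=1)
print(hot)

# Assumption: rows without any 1 should give NaN instead of the first column.
has_one = df.eq(1).any(axis=1)
guarded = hot.where(has_one)
```

With the question's data every row has a 1, so `guarded` matches `hot`.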

Based on what you mention in the comment below: if the column containing 'row1' and 'row2' is not already your index, you can call df.set_index first and then use idxmax as above:

In [120]: df
Out[120]:
  index  col1  col2  col3  col4  col5
0  row1     0     1     0     0     0
1  row2     0     0     0     0     1

In [121]: df.set_index('index').idxmax(1)
Out[121]:
index
row1    col2
row2    col5
dtype: object
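
A runnable sketch of the same step, assuming the labels live in the first column (it is named `index` here only to mirror the output above). It selects that column by position with `df.columns[0]`, so the code does not depend on the column's name. The names `result` and `as_frame`, and the column name `"column"`, are chosen here for illustration.

```python
import pandas as pd

# Frame where the row labels are an ordinary string column, not the index.
df = pd.DataFrame({
    "index": ["row1", "row2"],
    "col1": [0, 0],
    "col2": [1, 0],
    "col3": [0, 0],
    "col4": [0, 0],
    "col5": [0, 1],
})

# Move the first column into the index so only the 0/1 columns remain,
# then take the column label of each row's maximum.
result = df.set_index(df.columns[0]).idxmax(axis=1)
print(result)

# Optional: turn the Series back into a two-column data frame.
as_frame = result.rename("column").reset_index()
```

Setting the index first keeps the string column out of the comparison, which is likely why calling idxmax on the mixed frame was failing.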
fuglede
  • The first column is of string type; the other columns are either 0 or 1. In this scenario it's failing. – Sovan Oct 07 '18 at 13:35
  • Then you can set that column to be your index using `df.set_index`, specifying the name of the column (i.e. what boils down to `df.set_index(df.columns[0])`). – fuglede Oct 07 '18 at 13:36