The input DataFrame is as follows:

import pandas as pd

df_input = pd.DataFrame([[1,0,0,0,0], [0,1,0,0,0], [0,0,1,0,0],[0,0,0,1,0],[0,0,0,0,1]], columns=["A", "B","C","D","E"])

In every row, exactly one column has a non-zero entry. Accordingly, the output DataFrame I expect is as follows:

df_output=pd.DataFrame(['A','B','C','D','E'],columns=['Alphabet'])

The Alphabet column should contain, for each row, the name of the column that holds the non-zero value. Please suggest how to do this.

Abhishek Kulkarni

2 Answers

You can take a dot product between a boolean DataFrame (marking which entries are not equal to 0) and the column names.

If a row may have more than one column with a non-zero value and you want only the first one:

df_input.ne(0).dot(df_input.columns+',').str.split(",").str[0].to_frame('Alphabet')

If each row always has exactly one column with a non-zero value, you can use rstrip instead, as @Shubham mentioned in the comments.

df_input.ne(0).dot(df_input.columns+',').str.rstrip(',').to_frame('Alphabet')

  Alphabet
0        A
1        B
2        C
3        D
4        E
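
To see why the dot product yields column names, here is a minimal pure-Python sketch of what happens for a single row. It relies on Python treating True as 1 and False as 0 when multiplying a string. The helper name join_nonzero_labels is hypothetical and used only for illustration; it is not part of pandas.

```python
columns = ["A", "B", "C", "D", "E"]

def join_nonzero_labels(row):
    # (value != 0) is a bool; bool * str repeats the string 1 or 0 times,
    # so only the labels of non-zero entries survive the concatenation.
    return "".join((value != 0) * (name + ",") for value, name in zip(row, columns))

single = join_nonzero_labels([0, 0, 1, 0, 0])  # one non-zero column
multi = join_nonzero_labels([0, 1, 0, 1, 0])   # two non-zero columns

# The two variants differ only when a row has several non-zero columns.
first_of_multi = multi.split(",")[0]  # keeps the first label only
all_of_multi = multi.rstrip(",")      # keeps every label, comma-separated

print(single, multi, first_of_multi, all_of_multi)
```

With a single non-zero column both variants return the same label, so the choice only matters if the one-column assumption can be violated.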
anky

You can use idxmax along axis=1:

df_input.ne(0).idxmax(1).to_frame('Alphabet')

  Alphabet
0        A
1        B
2        C
3        D
4        E
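
One caveat worth checking, although it goes beyond the question (which guarantees one non-zero entry per row): for a row that is entirely zero, idxmax on the boolean mask returns the first column label rather than a missing value. The sketch below uses a small made-up frame to show one possible way to mask such rows.

```python
import pandas as pd

# Hypothetical data: the last row has no non-zero entry.
df_example = pd.DataFrame(
    [[1, 0, 0], [0, 0, 1], [0, 0, 0]],
    columns=["A", "B", "C"],
)

mask = df_example.ne(0)

# Plain idxmax reports "A" for the all-zero row, because every value ties at False.
naive = mask.idxmax(axis=1)

# Keep the label only where the row really contains a non-zero value.
guarded = naive.where(mask.any(axis=1))

print(guarded.to_frame("Alphabet"))
```

If the data always satisfies the one-non-zero-per-row condition, the guard is unnecessary and the one-liner above is enough.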
Shubham Sharma