Pandas DF. For each row, return the column name of the True value

Question

I have a pandas DataFrame with 5k+ rows and approximately 10 True/False columns. In each row, exactly one column's entry is True and the remaining 9 are False.

# Import library
import pandas as pd

# Create dictionary and convert to pd DF
test = {"col1":[True, False, True, True, False],
        "col2":[False, True, False, False, True]}

test = pd.DataFrame(test)

# Display the DataFrame
print(test)

The dataframe should look like

    col1    col2
0   True    False
1   False   True
2   True    False
3   True    False
4   False   True

I am hoping to return an array with the following values:

output_array = ['col1','col2','col1','col1','col2']

I'm stuck. I suspect I should use some sort of apply method and index the 10 columns, but I am not sure of the best way to scan the elements of each row for True and return the matching column name. Any help is much appreciated, and thank you!

`test.dot(test.columns)` - https://stackoverflow.com/a/60472541/6075699 – Dishin H Goyani, Jun 29 '21 at 03:55
@DishinHGoyani Interesting, I tried it and it works, but why does it work? What does it mean to take the dot product with the `columns`? – tdy, Jun 29 '21 at 04:03
@tdy I guess it computes the inner product of two vectors; for the first row it would be `np.array([1,0]) @ np.array(["col1","col2"], dtype="O")` – Dishin H Goyani, Jun 29 '21 at 09:28
But if there are multiple `True` values in a row, it will give something like "col1col2" – Dishin H Goyani, Jun 29 '21 at 09:31
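To make the comment thread above concrete, here is a minimal sketch of why the dot product returns a label. It assumes the same `test` frame as in the question; the names `via_dot`, `manual`, and `both` are introduced here only for illustration. The idea is that Python evaluates `True * "col1"` as `"col1"` and `False * "col2"` as `""`, so summing the products along a row concatenates the labels of the True columns.

```python
import pandas as pd

test = pd.DataFrame({"col1": [True, False, True, True, False],
                     "col2": [False, True, False, False, True]})

# Row-wise inner product of the boolean values with the column labels.
via_dot = test.dot(test.columns)

# The same arithmetic spelled out by hand for the first row:
# True * "col1" is "col1", False * "col2" is "", and joining them gives "col1".
first_row = test.iloc[0].tolist()
manual = "".join(flag * name for flag, name in zip(first_row, test.columns))

# With more than one True in a row, the labels are concatenated.
both = pd.DataFrame({"col1": [True], "col2": [True]})
concatenated = both.dot(both.columns).iloc[0]

print(via_dot.tolist())
print(manual, concatenated)
```

This only gives a clean label when each row has exactly one True, which matches the setup described in the question.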

score 2 · Answer 1 · edited Jun 29 '21 at 03:52

true_col_name = test.idxmax(axis=1)

gives you a Series containing, for each row, the name of the column that holds the True value, assuming that there is in fact exactly one True value per row.

In [6]: test.idxmax(axis=1)
Out[6]: 
0    col1
1    col2
2    col1
3    col1
4    col2
dtype: object
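Since the question asks for an array rather than a Series, the result can be converted afterwards. The following is a small sketch, assuming the `test` frame from the question; `output_list` and `exactly_one` are names added here for illustration. The check on the row sums is an optional guard, because `idxmax` returns the first column for a row that contains no True at all.

```python
import pandas as pd

test = pd.DataFrame({"col1": [True, False, True, True, False],
                     "col2": [False, True, False, False, True]})

# Optional guard: confirm every row has exactly one True before trusting idxmax.
exactly_one = bool((test.sum(axis=1) == 1).all())

true_col_name = test.idxmax(axis=1)

# Convert the Series of labels to a NumPy array or a plain Python list.
output_array = true_col_name.to_numpy()
output_list = true_col_name.tolist()

print(exactly_one)
print(output_list)
```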

edited Jun 29 '21 at 03:52

Ferris


answered Jun 29 '21 at 03:23

Steele Farnsworth


Note that it will work even if there are multiple `True` values; the tie-breaker just goes to the leftmost `True`. – tdy Jun 29 '21 at 03:25
How can I get all the columns (sometimes more than one) with True values? – user77005 Aug 06 '23 at 06:56
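For the multiple-True case raised in the last comment, one possible approach (not taken from the answer itself) is to mask the column index with each row's boolean values. The frame `multi` and the names `true_cols` and `joined` below are hypothetical, made up for this sketch.

```python
import pandas as pd

# Hypothetical frame in which a row may contain several True values.
multi = pd.DataFrame({"col1": [True, False, True],
                      "col2": [True, True, False],
                      "col3": [False, True, False]})

# Use each row's boolean values as a mask over the column labels.
true_cols = pd.Series(
    [multi.columns[mask].tolist() for mask in multi.to_numpy()],
    index=multi.index,
)

# Alternative: reuse the dot-product idea with a separator, then trim the trailing one.
joined = multi.dot(multi.columns + ",").str.rstrip(",")

print(true_cols.tolist())
print(joined.tolist())
```

The list form is easier to process further, while the joined string is convenient for display; a row with no True values yields an empty list or an empty string respectively.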
