Get index number when condition is true in 3 columns

Question

I've a question regarding some code in python. I'm trying to extract the index of the first row when the condition TRUE is satisfied in 3 different columns. This is the data I'm using:

       0       1       2             3           4
0   TRUE    TRUE    TRUE    0.41871395  0.492517879
1   TRUE    TRUE    TRUE    0.409863582 0.519425031
2   TRUE    TRUE    TRUE    0.390077415 0.593127232
3   FALSE   FALSE   FALSE   0.372020631 0.704367199
4   FALSE   FALSE   FALSE   0.373546556 0.810876797
5   FALSE   FALSE   FALSE   0.398876919 0.86855678
6   FALSE   FALSE   FALSE   0.432142094 0.875576037
7   FALSE   FALSE   FALSE   0.454115421 0.863063448
8   FALSE   TRUE    FALSE   0.460676901 0.855739006
9   FALSE   TRUE    FALSE   0.458693197 0.855128636
10  FALSE   FALSE   FALSE   0.459201839 0.856451104
11  FALSE   FALSE   FALSE   0.458693197 0.855739006
12  FALSE   FALSE   FALSE   0.458082827 0.856349376
13  FALSE   FALSE   FALSE   0.456556902 0.856959746
14  TRUE    TRUE    TRUE    0.455946532 0.858180486
15  TRUE    TRUE    TRUE    0.455030976 0.858790857
16  TRUE    TRUE    TRUE    0.454725791 0.858485672
17  FALSE   FALSE   FALSE   0.454420606 0.857875301
18  FALSE   FALSE   FALSE   0.454725791 0.858383943
19  FALSE   TRUE    FALSE   0.453199866 0.856654561
20  FALSE   FALSE   FALSE   0.451979125 0.856349376
21  FALSE   FALSE   FALSE   0.45167394  0.856959746
22  FALSE   FALSE   FALSE   0.451775669 0.857570116
23  FALSE   FALSE   FALSE   0.45106357  0.857264931
24  TRUE    TRUE    TRUE    0.450758385 0.856654561
25  TRUE    TRUE    TRUE    0.4504532   0.856044191
26  TRUE    TRUE    TRUE    0.449232459 0.856349376
27  TRUE    TRUE    TRUE    0.448316904 0.855535549

and I need to get the index number only when there are 3 'True' conditions:

0   
14  
24

Thank you!

Are `TRUE` and `FALSE` string or booleans here ? Because in python a boolean only has the first letter capital -> `True & False` — Benoit Drogou, Aug 21 '19 at 15:51

wbadart · Answer 1 · 2019-08-21T19:03:43.540

EDIT 2: I have a new answer in response to some clarifications.

You're looking for each row that has TRUE in columns 0, 1, or 2, BUT you'd like to ignore such rows that are not the first in a streak of them. The first part of my answer is still the same, I think you should create a mask that selects your TRUE triplet rows:

condition = df[[0, 1, 2]].all(axis='columns')

But now I present a possible way to filter out the rows you want to ignore. To be not-first in a streak of TRUE triplet rows means that the previous row also satisfies condition.

idx = df[condition].index
ignore = idx.isin(idx + 1)
result = idx[~ignore]

In other words, ignore rows where the index value is the successor of an index value satisfying condition.

Hope this helps!

Keeping my original answer for record keeping:

I think you'll end up with the most readable solution by breaking this out into two steps:

First, find out which rows have the value True for all of the columns you're interested in:

condition = df[[0, 1, 2]].all(axis='columns')

Then, the index values you're interested in are simply df[condition].index.

EDIT: if, as Benoit points out may be the case, TRUE and FALSE are strings, that's fine, you just need a minor tweak to the first step:

condition = (df[[0, 1, 2]] == 'TRUE').all(axis='columns')

I think I, and the other answerers, are confused because your explanation doesn't line up with your expected answer. You say "I'm trying to extract the index of the first row when the condition TRUE is satisfied in 3 different columns", in which case you could take the `head` of `df[condition]` for your answer. I see now that what you *really* want is the first row of each streak of `TRUE` triplet rows. I'll workshop my response accordingly. — wbadart, Aug 21 '19 at 17:45

score 1 · Accepted Answer · answered Aug 21 '19 at 16:52

I guess everyone missed the "extract the index of the first row" part. One of the way would be removing consecutive duplicates first and then obtaining index where all three is True so that you only get first row of the truth

df=df[['0', '1', '2']]
df=df[df.shift()!=df].dropna().all(axis=1)
print(df[df].index.tolist())

OUTPUT:

[0, 14, 24]

score 1 · Answer 3 · answered Aug 21 '19 at 18:32

1

I tried this on a demo dataframe and it seems to work for me.



df = pd.DataFrame(data={'A':[True,True,True,True,True,False,True,True],'B':[True,True,False,True,True,False,True,True],'C':[True,False,True,True,True,False,True,True]})
i =df[(df['A']==True) & (df['B']==True) & (df['C']==True)].index.to_list()

i = [x for x in i if x-1 not in i]

answered Aug 21 '19 at 18:32

Parijat Bhatt

664
4
6

You could replace the column names with the names you have in your dataframe – Parijat Bhatt Aug 21 '19 at 18:34

score 1 · Answer 4 · answered Aug 21 '19 at 20:38

If the TRUE and FALSE in your DataFrame are actually the boolean values True and False then,

#This will look at the first 3 columns and return True if "all" are True else it will return False:

step1 = [all(q) for q in df[[0,1,2]].values]

id = []
cnt = 0
temp_cnt = 0

#this loop finds where the value is true and checks if the next 2 are also true 
#it then appends the count-2 to a list named id, the -2 compensates for the index.
for q in step1:
    if q:
        cnt += 1
        if cnt == 3:
            id.append(temp_cnt - 2)
    else:
        cnt = 0

    temp_cnt += 1


#Then when printing "id" it will return the first index where AT LEAST 3 True values occur in sequence.
id
Out[108]: [0, 14, 24]

ats · Answer 5 · 2019-08-25T13:52:21.817

1

I think this could do the trick. As a general advice though, it always helps to name the columns in pandas.

Say that your pandas data frame is named data:

data[(data[0] == True) & (data[1] == True) & (data[2] == True)].index.values

or

list(data[(data[0] == True) & (data[1] == True) & (data[2] == True)].index.values)

edited Aug 25 '19 at 13:52

answered Aug 25 '19 at 13:42

ats

141
6

Ted · Answer 6 · 2019-08-22T07:20:59.330

0

The following will work regardless of the position of the 3 columns you wish to check for True values, and gives you back a list indicating which rows have 3 True values present:

Edit:

Now updated to better align with the OP's original request:

#df.iloc[:,:3] = df.iloc[:,:3].apply(lambda x: str(x) == "TRUE")    # If necessary
s = (df == True).apply(sum, axis=1) == 3
s = s[s.shift() != s]
s.index[s].tolist()

edited Aug 22 '19 at 07:20

answered Aug 21 '19 at 15:49

Ted

1,189
8
15

There are some more elegant answers here, certainly, but any downvoters care to comment? – Ted Aug 21 '19 at 16:04
What if there are other boolean columns? I believe your answer is way too coupled to the specific given dataframe – Itay Aug 21 '19 at 16:26
@Itay OP's original request was "when the condition TRUE is satisfied in 3 different columns." Although one could infer from the given data that the first 3 columns were what was meant, it is equally likely that the question was a general one about a condition being `True` in any 3 columns per row in a given dataframe. – Ted Aug 21 '19 at 16:29
1

Okay, you convinced me. – Itay Aug 21 '19 at 16:31
This does not generate the expected output : `[0, 14, 24]` – Ricky Kim Aug 21 '19 at 16:57
I agree with @wbadart that the original explanation didn't really line up with the expected output. I've updated my answer accordingly. – Ted Aug 22 '19 at 07:17

score 0 · Answer 7 · answered Aug 21 '19 at 15:54

0

Based on the answer here, something like this will provide a list of indices for the rows that meet all conditions:

df[(df[0]==True) & (df[1]==True) & (df[2]==True)].index.tolist()

answered Aug 21 '19 at 15:54

Korean_Of_the_Mountain

1,428
3
16
40

This does not generate the expected output : `[0, 14, 24]` – Ricky Kim Aug 21 '19 at 16:57
@RickyKim has the right answer. I totally skipped over the part where OP only wanted the index of first group of rows that met conditions. I tried to think of a way without using `shift()` to be more original, but couldn't come up with anything outside of inefficiently looping over each row. – Korean_Of_the_Mountain Aug 21 '19 at 18:03

Get index number when condition is true in 3 columns

7 Answers7