Counting the amount of times a boolean goes from True to False in a column

Question

I have a column in a dataframe which is filled with booleans, and I want to count how many times it changes from True to False.

I can do this when I convert the booleans to 1s and 0s, then use df.diff and divide the result by 2.

import pandas as pd

d = {'Col1': [True, True, True, False, False, False, True, True, True, True, False, False, False, True, True, False, False, True, ]}


df = pd.DataFrame(data=d)


print(df)

     Col1
0    True
1    True
2    True
3   False
4   False
5   False
6    True
7    True
8    True
9    True
10  False
11  False
12  False
13   True
14   True
15  False
16  False
17   True

My expected outcome would be: the number of times False came up (following True) is 3.
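The diff-and-halve idea described in the question can be sketched as follows. This is only an illustration of that idea, with illustrative variable names (`steps`, `total_changes`). Halving the number of changes equals the True to False count only when the column starts and ends on the same value, which holds for the 18-value list used here.

```python
import pandas as pd

d = {'Col1': [True, True, True, False, False, False, True, True, True, True,
              False, False, False, True, True, False, False, True]}
df = pd.DataFrame(data=d)

# diff() gives +1 for False -> True, -1 for True -> False, 0 for no change
# (row 0 is NaN because it has no predecessor).
steps = df['Col1'].astype(int).diff()

# Changes in either direction; NaN is skipped by sum().
total_changes = int(steps.abs().sum())
print(total_changes / 2)  # 3.0
```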

This is a dupe. Looking for the best link. In the meantime, try [this](https://stackoverflow.com/questions/45024200/counting-changes-of-value-in-each-column-in-a-data-frame-in-pandas) or [this](https://stackoverflow.com/questions/53542668/count-appearances-of-a-value-until-it-changes-to-another-value) or [this](https://stackoverflow.com/questions/30196063/determining-when-a-column-value-changes-in-pandas-dataframe) or [this](https://stackoverflow.com/questions/53189792/how-to-count-the-number-of-state-change-in-pandas). – pault, Jan 16 '19 at 15:08

yatu · Accepted Answer · 2019-01-16T23:15:55.420

7

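You can perform a bitwise AND of the shifted Col1 (the previous row's value) with a mask indicating where the value changes between successive rows. A row is then counted only when the previous value was True and the current value differs from it, which is exactly a change from True to False (fill_value in shift requires pandas 0.24 or later):

(df.Col1.shift(1, fill_value=False) & (df.Col1 != df.Col1.shift(1))).sum()
3

The mask is obtained by comparing Col1 with a shifted version of itself (Series.shift). Row 0 is True only because it is compared with the NaN introduced by the shift:

df.Col1 != df.Col1.shift(1)

0      True
1     False
2     False
3      True
4     False
5     False
6      True
7     False
8     False
9     False
10     True
11    False
12    False
13     True
14    False
15     True
16    False
17     True
Name: Col1, dtype: bool

For multiple columns, you can do exactly the same on the whole dataframe (here tested with a Col2 identical to Col1):

(df.shift(1, fill_value=False) & (df != df.shift(1))).sum()

Col1    3
Col2    3
dtype: int64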
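To make the direction of a transition explicit, one option is to pair each row with its predecessor and test both values directly. The following is a minimal sketch rather than part of the original answer; the names `prev`, `falling` and `rising` are illustrative, and `fill_value` in `shift` assumes pandas 0.24 or later.

```python
import pandas as pd

df = pd.DataFrame({'Col1': [True, True, True, False, False, False, True, True, True, True,
                            False, False, False, True, True, False, False, True]})

col = df['Col1']
# Previous row's value; the first row has no predecessor, so treat it as False.
prev = col.shift(1, fill_value=False)

# True -> False transitions: previous row True, current row False.
falling = int((prev & ~col).sum())
# False -> True transitions; skip row 0 so a leading True is not counted as a change.
rising = int((~prev & col).iloc[1:].sum())

print(falling, rising)  # 3 3
```

Keeping the two directions as separate expressions makes it easy to check which one a given formula is actually counting.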

edited Jan 16 '19 at 23:15

answered Jan 16 '19 at 15:07


Is there a way to apply this to the whole dataframe? (I should've said I had multiple columns like this) – Martijn van Amsterdam Jan 16 '19 at 15:17
Won't this also capture when `False` changes to `True` at the start of a series? Just checking since OP has specified `True` to `False` only. – jpp Jan 16 '19 at 15:25
No, because it's doing a logical AND with the series. So the only remaining `True`s will be those that were already `True` in `df` – yatu Jan 16 '19 at 15:26

score 4 · Answer 2 · answered Jan 16 '19 at 15:19

Notice that subtracting True (1) from False (0) in integer terms gives -1:

res = df['Col1'].astype(int).diff().eq(-1).sum()  # 3

To apply across a Boolean dataframe, you can construct a series mapping label to count:

res = df.astype(int).diff().eq(-1).sum()
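As a small worked example of the per-column count, here is a sketch on a made-up two-column frame. `Col2` and its values are hypothetical and chosen only to show that each column is counted independently.

```python
import pandas as pd

df = pd.DataFrame({
    'Col1': [True, True, False, False, True, False],
    'Col2': [False, True, True, True, False, False],  # hypothetical second column
})

# Row 0 is NaN; later rows hold -1, 0 or +1 per column.
deltas = df.astype(int).diff()
# Count the -1 entries (True -> False) in each column.
res = deltas.eq(-1).sum()

print(int(res['Col1']), int(res['Col2']))  # 2 1
```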

score 2 · Answer 3 · answered Jan 16 '19 at 15:16

Just to offer a different idea:

df.cumsum()[~df.Col1].nunique()
Out[408]: 
Col1    3
dtype: int64
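Why this works: the cumulative sum only increases on True rows, so it stays constant across each run of False values, and each False run that follows at least one True gets its own value. Counting distinct values on the False rows therefore counts False runs. Below is a minimal sketch of that reasoning, with illustrative variable names. Note that a column starting with False has its leading run counted too, even though no True to False change precedes it.

```python
import pandas as pd

col = pd.Series([True, True, True, False, False, False, True, True, True, True,
                 False, False, False, True, True, False, False, True], name='Col1')

running = col.astype(int).cumsum()   # increases only on True rows
on_false = running[~col]             # cumulative sum seen on each False row
print(on_false.tolist())             # [3, 3, 3, 7, 7, 7, 9, 9]
print(on_false.nunique())            # 3

# Edge case: a leading False run is counted although nothing changed to False there.
lead = pd.Series([False, True, False])
lead_count = lead.astype(int).cumsum()[~lead].nunique()
print(lead_count)                    # 2
```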

BENY

score 1 · Answer 4 · answered Jan 16 '19 at 15:25

My strategy was to find the difference from one row to the next (treating True as 1 and False as 0, of course).

Thus, Col1 - Col1.shift() represents the delta, where 1 is a change from False to True, 0 is no change, and -1 is a change from True to False.

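import pandas as pd

d = {'Col1': [True, True, True, False, False, False, True, True, True, True, False, False, False, True, True, False, False, True]}

df = pd.DataFrame(data=d)
# Cast to int first so the subtraction is numeric: 1 - 0 = 1, 0 - 1 = -1.
col = df['Col1'].astype(int)
df['delta'] = col - col.shift()
# Count how often each delta value (1, 0, -1) occurs; the NaN in row 0 is ignored.
boolean_shifts = df['delta'].value_counts()
# -1 marks a True -> False change; .get avoids a KeyError if there are none.
print(boolean_shifts.get(-1, 0))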

After getting the value counts as a Series indexed by these [1, 0, -1] values, you can select just the -1 entry to get the number of times the column changed from True to False. I hope this helps answer your question!

score 1 · Answer 5 · answered Jan 16 '19 at 15:57

A less concise but perhaps more readable approach would be:

count = 0
for item in zip(d['Col1'], d['Col1'][1:]):
    if item == (True, False):
        count += 1
print(count)
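The same pairwise scan can be wrapped in a small function so it works on any sequence of booleans. This is a sketch only: `count_true_to_false` is a hypothetical helper name, not something from pandas, and a dataframe column could be passed as `df['Col1'].tolist()`.

```python
def count_true_to_false(values):
    """Count adjacent (True, False) pairs in a sequence of booleans."""
    values = list(values)
    # zip pairs each element with its successor; the last element has none.
    return sum(1 for prev, cur in zip(values, values[1:]) if prev and not cur)


col1 = [True, True, True, False, False, False, True, True, True, True,
        False, False, False, True, True, False, False, True]
print(count_true_to_false(col1))  # 3
```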

alec_djinn

Isn't looping over a dataframe considered bad practice? – Martijn van Amsterdam Jan 16 '19 at 17:44
It could be slower perhaps, but I don't see what could go wrong with it. Why would you say that it is bad practice? – alec_djinn Jan 17 '19 at 06:48
I am by no means an expert in using pandas, but when I was looking for a solution to one of my problems, most comments and answers suggested that you shouldn't use a for loop because, as you said, it's slower. I recently figured out how to do something without using a for loop and that part is about 3 times faster now, which doesn't really matter for small datasets – Martijn van Amsterdam Jan 17 '19 at 07:33
Indeed. If the dataframe is not huge and speed is not a concern then this solution will work just fine. Personally I prefer readability to speed. That loop can be easily understood by anyone, even without experience with Pandas. – alec_djinn Jan 17 '19 at 10:04
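For anyone who wants to measure the trade-off discussed in these comments, here is a rough timing sketch. The data is synthetic (an assumption, not from the thread), the function names are illustrative, and the timings depend on machine and pandas version, so no numbers are claimed here.

```python
import random
import timeit

import pandas as pd

random.seed(0)
# Synthetic column of 100,000 random booleans (assumed size, for illustration only).
values = [random.random() < 0.5 for _ in range(100_000)]
col = pd.Series(values)


def loop_count():
    # Plain Python scan over adjacent pairs.
    return sum(1 for a, b in zip(values, values[1:]) if a and not b)


def vectorized_count():
    # Previous row True and current row False, computed column-wise.
    return int((col.shift(1, fill_value=False) & ~col).sum())


# Both approaches should agree before their speed is compared.
assert loop_count() == vectorized_count()
print('loop      :', timeit.timeit(loop_count, number=5))
print('vectorized:', timeit.timeit(vectorized_count, number=5))
```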
