Count consecutive occurrences of True values in a two-dimensional numpy array (matrix) of Booleans

Question

I am new to Python, and I need to count the lengths of runs of consecutive True values in a nested list or a two-dimensional numpy array of Booleans. Say I have a nested list like listX = [[True, False, True, True, True], [False, True, True, False, True], [False, True, False, False, False], [True, True, False, False, True]]. I want to count the consecutive True values in each inner list; for example, for listX[0] I want the answer to be [1, 3]. (In reality, the nested list can contain anywhere from 10 to 25 lists, each with 100 Boolean values.) Based on the itertools approach from an answer to an earlier question about one-dimensional arrays, "Count consecutive occurences of values varying in length in a numpy array", I can solve my simple example like this:

listX = [[True, False, True, True, True], [False, True, True, False, True], [False, True, False, False, False], [True, True, False, False, True]]

import numpy as np

arr = np.array(listX)
arr
>>> array([[ True, False,  True,  True,  True],
       [False,  True,  True, False,  True],
       [False,  True, False, False, False],
       [ True,  True, False, False,  True]])

import itertools

c1 = [sum(1 for _ in group) for key, group in itertools.groupby(arr[0]) if key]
c2 = [sum(1 for _ in group) for key, group in itertools.groupby(arr[1]) if key]
c3 = [sum(1 for _ in group) for key, group in itertools.groupby(arr[2]) if key]
c4 = [sum(1 for _ in group) for key, group in itertools.groupby(arr[3]) if key]

c1, c2, c3, c4
>>> ([1, 3], [2, 1], [1], [2, 1])

Since this example has only 4 rows, I can index each row of the 2D array by hand, but in reality I can have anywhere from 10 to 25 rows, each containing 100 Boolean values. Is there a simpler way than this?

The question you cited is, indeed, a duplicate of what you're asking. All you need to do is apply that answer's logic to each row of your nested list. Since you haven't explained where you're having trouble with that technique, please repeat [how to ask](https://stackoverflow.com/help/how-to-ask) and [MRE](https://stackoverflow.com/help/minimal-reproducible-example) from the [intro tour](https://stackoverflow.com/tour). — Prune, May 16 '20 at 01:47
You can't do this in 2D in numpy in the general case because the rows may have different runs. You can count the runs by summing `np.diff` of the correct axis if the array is boolean. You'll need to find the right divisor and handle edge cases too of course — Mad Physicist, May 16 '20 at 03:22
Thanks @MadPhysicist, this is helpful to know. I will play around. — Qiaoling Cui, May 16 '20 at 03:34
I suggest you delete the question until you know what you're asking. It's easier to fix and undelete a question that's deleted voluntarily than to reopen one closed by the community — Mad Physicist, May 16 '20 at 03:48
@Prune, the reality is that I have a flexible number of rows, sometimes can have 10, up to 25, so it doesn't sound a smart way to repeat the same code 10-25 times just for one array — Qiaoling Cui, May 16 '20 at 03:52

Valdi_Bo · Accepted Answer · 2020-05-16T08:58:18.263

Wrap the code you apply to each row in the following lambda function:

myCount = lambda ar: [sum(1 for _ in group) for key, group in itertools.groupby(ar) if key]

Then assemble the results for each row as follows:

res = []
for i in range(arr.shape[0]):
    res.append(myCount(arr[i]))
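
As a minimal sketch, the same idea can also be written with a named function and a list comprehension, since iterating over a 2D array yields its rows one at a time. The name count_true_runs is only illustrative and is not part of the answer's code:

```python
import itertools

import numpy as np


def count_true_runs(row):
    """Return the length of each run of consecutive True values in a 1D sequence."""
    # groupby yields (key, group) pairs for each run of equal values;
    # keep only the runs whose key is True and count their elements.
    return [sum(1 for _ in group) for key, group in itertools.groupby(row) if key]


arr = np.array([[True, False, True, True, True],
                [False, True, True, False, True],
                [False, True, False, False, False],
                [True, True, False, False, True]])

# Works for any number of rows without indexing each one by hand.
res = [count_true_runs(row) for row in arr]
print(res)  # [[1, 3], [2, 1], [1], [2, 1]]
```

This works the same way on the original nested list (listX) without converting it to an array first.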

To test other cases as well, I extended your sample data with a row of all False values and another row of all True values:

array([[ True, False,  True,  True,  True],
       [False,  True,  True, False,  True],
       [False,  True, False, False, False],
       [ True,  True, False, False,  True],
       [False, False, False, False, False],
       [ True,  True,  True,  True,  True]])

The result for the above array is:

[[1, 3], [2, 1], [1], [2, 1], [], [5]]

I think this result should be left as a Pythonic nested list, because Numpy does not support "jagged" arrays (with rows of different lengths).
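
One of the comments on the question suggests using np.diff. The following is a rough sketch of how that might look for a single row, still collecting the per-row results in a nested list because the rows can have different numbers of runs. The function name true_run_lengths and the padding trick are assumptions made for illustration, not something given in the comment:

```python
import numpy as np


def true_run_lengths(row):
    """Return the lengths of runs of True in a 1D Boolean sequence, using np.diff."""
    # Pad with False on both sides so runs touching either edge are delimited.
    padded = np.concatenate(([False], np.asarray(row, dtype=bool), [False]))
    # After casting to integers, diff is +1 where a run starts and -1 where it ends.
    edges = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return (ends - starts).tolist()


arr = np.array([[True, False, True, True, True],
                [False, True, True, False, True],
                [False, True, False, False, False],
                [True, True, False, False, True],
                [False, False, False, False, False],
                [True, True, True, True, True]])

res = [true_run_lengths(row) for row in arr]
print(res)  # [[1, 3], [2, 1], [1], [2, 1], [], [5]]
```

For rows of 100 values and 10 to 25 rows, either approach should be fast enough; the itertools version is arguably easier to read.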
