Find runs and lengths of consecutive values in an array

Question

I'd like to find equal values in an array, along with their indices, if they occur consecutively more than 2 times.

[0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4]

So in this example I would find that the value "2" occurred "4" times, starting from position "8". Is there a built-in function to do that?

I found a way with collections.Counter

collections.Counter(a)
# Counter({2: 5, 1: 4, 0: 3, 3: 2, 4: 1})

But this is not what I am looking for, because it counts all occurrences rather than consecutive ones. Of course I can write a loop that compares neighbouring values and counts them, but maybe there is a more elegant solution?

The answers to [this question](https://stackoverflow.com/questions/24342047/count-consecutive-occurences-of-values-varying-in-length-in-a-numpy-array) may be what you are looking for. To convert your data into a boolean array you can use something similar to `a == 2` etc. — bluecouch, Apr 05 '22 at 05:13

Michael Szczesny · Accepted Answer · 2022-04-05T06:23:50.023

Find consecutive runs and length of runs with condition

import numpy as np

arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4])

res = np.ones_like(arr)
np.bitwise_xor(arr[:-1], arr[1:], out=res[1:])  # XOR of neighbours is 0 where they are equal (integer arrays only)
# for float arrays, use this instead
# arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2.4, 2.4, 2.4, 2, 1, 3, 4, 4, 4, 5])
# res = np.hstack([True, ~np.isclose(arr[:-1], arr[1:])])
idxs = np.flatnonzero(res)                      # indices of nonzero elements, i.e. where each run starts
values = arr[idxs]                              # value of each run
counts = np.diff(idxs, append=len(arr))         # distance between consecutive run starts is the run length

cond = counts > 2
values[cond], counts[cond], idxs[cond]

Output

(array([2]), array([4]), array([8]))
# (array([2.4, 4. ]), array([3, 3]), array([ 8, 14]))
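The steps above can be wrapped in a small helper. This is only a sketch: the name `find_runs` and the `min_length` parameter are made up for illustration, and it swaps the XOR trick for a plain `!=` comparison so the same code accepts non-integer dtypes. Because `!=` is an exact comparison, computed floats would still need the `np.isclose` variant from the commented lines in the answer.

```python
import numpy as np

def find_runs(arr, min_length=3):
    """Return (values, counts, starts) for runs of at least min_length equal elements.

    Hypothetical helper, not part of NumPy.
    """
    arr = np.asarray(arr)
    if arr.size == 0:
        empty = np.array([], dtype=int)
        return arr, empty, empty
    # True where an element differs from its predecessor; element 0 always starts a run
    is_start = np.concatenate(([True], arr[1:] != arr[:-1]))
    starts = np.flatnonzero(is_start)           # index where each run begins
    counts = np.diff(starts, append=arr.size)   # gap to the next run start is the run length
    keep = counts >= min_length
    return arr[starts][keep], counts[keep], starts[keep]

values, counts, starts = find_runs([0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4])
print(values.tolist(), counts.tolist(), starts.tolist())  # [2] [4] [8]
```

"More than 2 times" corresponds to `min_length=3` here; lowering it to 1 would return every run.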

What if the values in the initial array are not integers but floats, like `arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2.4, 2.4, 2.4, 2, 1, 3, 4, 4, 4, 5])`? — user1908375, Apr 05 '22 at 05:57

Daniel F · Answer 2 · 2022-04-05T06:53:35.980


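# label each run with an integer: the cumulative sum increases by 1 wherever a value changes
_, i, c = np.unique(np.r_[[0], ~np.isclose(arr[:-1], arr[1:])].cumsum(),
                    return_index=True,
                    return_counts=True)
# i holds the start index of each run, c its length
for index, count in zip(i, c):
    if count > 1:  # use count > 2 to match "more than 2 times" exactly
        print([arr[index], count, index])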

Out[]:  [2, 4, 8]

A slightly more compact way of doing it that works for both integer and float input.
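To see why the `cumsum` trick works, it can help to print the intermediate run labels. The sketch below uses the question's array; the names `changed`, `labels`, `first_index` and `run_length` are chosen here for illustration only.

```python
import numpy as np

arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4])

# 1 where an element differs from its predecessor, 0 where it continues a run
changed = np.r_[[0], ~np.isclose(arr[:-1], arr[1:])]
# the running total gives every run its own integer label
labels = changed.cumsum()
print(labels.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 8, 8, 8, 9, 10, 11]

# np.unique then reports, per label, the first position and the number of members
_, first_index, run_length = np.unique(labels, return_index=True, return_counts=True)
print(first_index.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 12, 13, 14]
print(run_length.tolist())   # [1, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 1]
```

The four positions sharing label 8 are the run of 2s, which is why its count is 4 and its first index is 8.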

edited Apr 05 '22 at 06:53

answered Apr 05 '22 at 06:34

Very nice solution, but `not equal` is not reliable for computed `floats`. – Michael Szczesny Apr 05 '22 at 06:49

@MichaelSzczesny True, let me just swipe that `~np.isclose(arr[:-1], arr[1:])` code snippet from you :P – Daniel F Apr 05 '22 at 06:54
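A small illustration of the point made in these comments, assuming the default tolerances of `np.isclose`: values that are mathematically equal but computed differently can fail an exact comparison, which would split what should be one run into several.

```python
import numpy as np

# 0.1 + 0.2 is not exactly 0.3 in binary floating point
a = np.array([0.3, 0.1 + 0.2, 0.3])

# exact comparison reports a change between every pair of neighbours
print((a[:-1] != a[1:]).tolist())            # [True, True]

# tolerance-based comparison treats all three values as one run
print((~np.isclose(a[:-1], a[1:])).tolist())  # [False, False]
```

If the default tolerances are too loose or too tight for the data, `np.isclose` accepts `rtol` and `atol` arguments.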
