I'd like to find equal values in an array, together with their indices, if they occur consecutively more than 2 times.

[0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4]

So in this example I would find that the value 2 occurred 4 times, starting at position 8. Is there any built-in function to do that?

I found a way with collections.Counter

collections.Counter(a)
# Counter({2: 5, 1: 4, 0: 3, 3: 2, 4: 1})

but this is not what I am looking for, since it counts all occurrences rather than consecutive ones. Of course I can write a loop that compares neighbouring values and counts them, but maybe there is a more elegant solution?
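For reference, the loop-and-count idea can be sketched in plain Python with `itertools.groupby`, which groups consecutive equal elements. This is only an illustrative sketch; the helper name `runs_longer_than` and its `min_count` parameter are hypothetical and not part of any library.

```python
from itertools import groupby


def runs_longer_than(values, min_count=2):
    """Return (value, count, start_index) for each run longer than min_count."""
    result = []
    start = 0  # index at which the current run begins
    for value, group in groupby(values):
        count = sum(1 for _ in group)  # length of this run
        if count > min_count:
            result.append((value, count, start))
        start += count  # the next run starts right after this one
    return result


a = [0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4]
print(runs_longer_than(a))  # [(2, 4, 8)]
```

This compares elements with exact equality, so it suits integers or strings better than floats.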

Michael Szczesny
user1908375
  • The answers to [this question](https://stackoverflow.com/questions/24342047/count-consecutive-occurences-of-values-varying-in-length-in-a-numpy-array) may be what you are looking for. To convert your data into a boolean array you can use something similar to `a == 2` etc. – bluecouch Apr 05 '22 at 05:13

2 Answers

7

Find consecutive runs and length of runs with condition

import numpy as np

arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4])

res = np.ones_like(arr)
np.bitwise_xor(arr[:-1], arr[1:], out=res[1:])  # XOR of neighbours: 0 where consecutive elements are equal
# for float arrays, use this instead:
# arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2.4, 2.4, 2.4, 2, 1, 3, 4, 4, 4, 5])
# res = np.hstack([True, ~np.isclose(arr[:-1], arr[1:])])
idxs = np.flatnonzero(res)                      # nonzero entries mark the start index of each run
values = arr[idxs]                              # value of each run
counts = np.diff(idxs, append=len(arr))         # distance between consecutive run starts is the run length

cond = counts > 2
values[cond], counts[cond], idxs[cond]

Output

(array([2]), array([4]), array([8]))
# (array([2.4, 4. ]), array([3, 3]), array([ 8, 14]))
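The same steps can be wrapped in a small reusable function. The sketch below is a variation rather than the code above: it detects run starts with `!=` instead of `np.bitwise_xor`, so it also accepts non-integer dtypes, but it compares exactly and is therefore not tolerance-aware for floats. The names `find_runs` and `min_length` are hypothetical.

```python
import numpy as np


def find_runs(arr, min_length=3):
    """Return (values, counts, starts) for runs of at least min_length equal elements."""
    arr = np.asarray(arr)
    if arr.size == 0:
        empty = np.array([], dtype=int)
        return arr, empty, empty
    # True at position 0 and wherever an element differs from its predecessor
    is_start = np.concatenate(([True], arr[1:] != arr[:-1]))
    starts = np.flatnonzero(is_start)          # start index of every run
    counts = np.diff(starts, append=arr.size)  # gap between starts is the run length
    keep = counts >= min_length
    return arr[starts][keep], counts[keep], starts[keep]


values, counts, starts = find_runs([0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4])
print(values, counts, starts)  # [2] [4] [8]
```

Here `min_length=3` corresponds to the "more than 2 times" condition in the question.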
Michael Szczesny
  • What if the values in the initial array are not integers but floats, like arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2.4, 2.4, 2.4, 2, 1, 3, 4, 4, 4, 5])? – user1908375 Apr 05 '22 at 05:57
1
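# Label each run: the cumulative sum increases by 1 wherever the value changes
_, i, c = np.unique(np.r_[[0], ~np.isclose(arr[:-1], arr[1:])].cumsum(),
                    return_index=True,
                    return_counts=True)
for index, count in zip(i, c):
    if count > 2:  # keep only runs of more than 2 elements, as the question asks
        print([arr[index], count, index])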

Out[]:  [2, 4, 8]

A slightly more compact way of doing it that works for both integer and float input.
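To make the labelling step easier to follow, the sketch below prints the intermediate run labels for the float array from the comment above. The variable names `changed`, `labels`, `starts` and `keep` are only illustrative, and `np.isclose` is used with its default tolerances.

```python
import numpy as np

arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2.4, 2.4, 2.4, 2, 1, 3, 4, 4, 4, 5])

# True where an element differs from its predecessor (beyond isclose's default tolerances)
changed = ~np.isclose(arr[:-1], arr[1:])
# Prepend 0 for the first element, then accumulate: equal neighbours share a label
labels = np.r_[[0], changed].cumsum()
print(labels.tolist())
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 8, 8, 9, 10, 11, 12, 12, 12, 13]

# First index and size of each label give the start and length of each run
_, starts, counts = np.unique(labels, return_index=True, return_counts=True)
keep = counts > 2
print(arr[starts][keep].tolist(), counts[keep].tolist(), starts[keep].tolist())
# [2.4, 4.0] [3, 3] [8, 14]
```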

Daniel F