Say that I have a NumPy array:
a = np.array([0, 1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 9, 9, 10, 11, 12, 13, 13, 13, 14, 15])
And I have a length m = 2 that the user specifies in order to see whether there are any repeats of that length within the time series. In this case, the repeats of length m = 2 are:
[2, 2]
[5, 5]
[9, 9]
[9, 9]
[13, 13]
[13, 13]
And the user can change this to m = 3, and the repeats of length m = 3 are:
[9, 9, 9]
[13, 13, 13]
I need a function that returns either the starting indices where repeats are found or None. So, for m = 3 the function would return the following NumPy array of starting indices:
[11, 17]
And for m = 4 the function would return None. What's the cleanest and fastest way to accomplish this?
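To make the expected behavior concrete, here is a rough sketch of the kind of function described above. The name find_repeats is made up for illustration, and this is only one possible approach (comparing each element with its neighbor and then requiring m - 1 consecutive matches), not necessarily the cleanest or fastest one.

```python
import numpy as np

def find_repeats(a, m):
    """Return the start indices of windows of m equal consecutive values, or None.

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a)
    if m < 1 or a.size < m:
        return None
    # eq[i] is True where a[i] == a[i + 1]
    eq = a[1:] == a[:-1]
    # A window starting at i is a repeat when eq[i], ..., eq[i + m - 2] are all True
    n = a.size - m + 1
    mask = np.ones(n, dtype=bool)
    for k in range(m - 1):
        mask &= eq[k:k + n]
    idx = np.flatnonzero(mask)
    return idx if idx.size else None

a = np.array([0, 1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 9, 9, 10, 11, 12, 13, 13, 13, 14, 15])
print(find_repeats(a, 3))  # start indices 11 and 17
print(find_repeats(a, 4))  # None
```

The loop runs m - 1 times over whole-array boolean operations, so it stays vectorized for small m, but it may not be ideal when m is large.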
Update
Note that the array does not have to be sorted, and we are not interested in the result after a sort. We only want the result from the unsorted array. Your result for m = 2 should be the same for this array:
b = np.array([0, 11, 2, 2, 3, 40, 5, 5, 16, 7, 80, 9, 9, 9, 1, 11, 12, 13, 13, 13, 4, 5])
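As a quick sanity check of that requirement, the sketch below compares the m = 2 start indices for the sorted and unsorted arrays. It assumes NumPy 1.20 or later (for sliding_window_view), the helper name repeat_starts is hypothetical, and it returns an empty array rather than None when nothing is found.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([0, 1, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9, 9, 9, 10, 11, 12, 13, 13, 13, 14, 15])
b = np.array([0, 11, 2, 2, 3, 40, 5, 5, 16, 7, 80, 9, 9, 9, 1, 11, 12, 13, 13, 13, 4, 5])

def repeat_starts(x, m):
    """Start indices of length-m windows whose values are all equal (illustrative only)."""
    windows = sliding_window_view(x, m)  # shape: (len(x) - m + 1, m)
    # A window is a repeat when every element equals the window's first element
    mask = (windows == windows[:, :1]).all(axis=1)
    return np.flatnonzero(mask)

# Both arrays have their repeats at the same positions, so this prints True
print(np.array_equal(repeat_starts(a, 2), repeat_starts(b, 2)))
```

Because only neighboring values are compared and nothing is sorted, the result depends solely on where equal values sit next to each other in the original order.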