Longest run/island of a number in Python

Question

I have an array with 0's repeating multiple times but I want to find the longest set of 0s. For example:

myArray= [[1,0],[2,0],[3,0][4,0][5,1][6,0][7,0][8,0][9,0][10,0][11,0][12,0][13,1][14,2][15,0][16,0][17,0][18,0][19,1][20,0]]

So the Y coordinate has zeros repeating continuously a few times but I need to find the longest run. In this case the longest run is 7 repeated zeros from X coordinate 6 to X coordinate 12. I'd like the program to tell me where the the longest run of zeros is (in this case from X=6 to X=12).

Thanks!

Divakar · Answer 1 · 2016-06-30T21:33:31.307

Think of the zeros as the ones we are after. So, we will try to assign them as True elements in a boolean array. This could be achieved with myArray[:,1]==0. Next up, let's find rising and falling edges of such regions of 0s. We might have 0s touching the limits/boundaries of the array, so in those cases, we might miss seeing such rising and falling edges. Thus, to cover those boundary cases, we can pad False's or 0's on either sides and then look for positive and negative differentiation values respectively. The indices corresponding to those rising and falling edges must be the start and stop of 0s intervals respectively and finally max of those would be the desired output.

We would have two versions to implement such an idea as listed next.

def max_interval_len_app1(myArray):
    # Construct a grouped differentiated array where 1s indicate start and -1 
    # as stop indices. Then  store such indices.
    grp_diffs = np.diff(np.hstack(([0],(myArray[:,1]==0).astype(int),[0])))
    grp_start_idx = np.where(grp_diffs==1)[0]
    grp_stop_idx = np.where(grp_diffs==-1)[0]

    if len(grp_start_idx)==0:
        return 0 # No zeros found
    else:    
        # Get 0's interval lens by subtracting start from each correspondin
        # stop indices. Return the max length 
        return (grp_stop_idx - grp_start_idx).max()

def max_interval_len_app2(myArray):
    # Get indices at which rising and falling edges occur.
    idx = np.where(np.diff(np.hstack(([False],myArray[:,1]==0,[False]))))[0]

    if len(idx)==0:
        return 0 # No zeros found
    else:        
        # Since the rising and falling edges happen in pairs we are guaranteed
        # to have an even sized idx . So reshape into Nx2 array and then do 
        # differentiation along the columns, which would be 0s interval lens. 
        # Max of the lengths should be the desired output.
        return np.diff(idx.reshape(-1,2),axis=1).max()

Sorry I don't really understand this, can you elaborate a little? What does np.diff do and np.hstack? — Nabeel Eh, Jun 30 '16 at 20:55
*"Since the rising and falling edges happen in pairs we are guaranteed to have an even sized idx"* - not if the sequence starts with zeros but ends with non-zeros (or vice versa) — ali_m, Jun 30 '16 at 21:04
@ali_m Hence the padding of `False/0` added on either sides to make sure we have those rising and falling edges I think. Adding explanation soon. — Divakar, Jun 30 '16 at 21:05
Sorry, didn't spot the padding. This still breaks if there are no non-zeros (you'll get the length of the whole array instead). — ali_m, Jun 30 '16 at 21:11
@ali_m Yup, there should be error-checking for that. Adding it soon. — Divakar, Jun 30 '16 at 21:14

score 0 · Answer 2 · answered Jun 30 '16 at 22:53

You can start by numbering the consecutive elements in a for loop and then simply get the index of the maximum value.

>>> cv, a = [], 0
>>> for x in myArray:
>>>     if x[1] == 0:
>>>         a += 1
>>>     else:
>>>         a = 0
>>>     cv.append(a)
>>> print(a)
[1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 6, 7, 0, 0, 1, 2, 3, 4, 0, 1]

>>> mi = cv.index(max(cv))  # finds index of *first* maximum
>>> run = [mi - cv[mi] + 1, mi]
>>> print(run)
[5, 11]

Longest run/island of a number in Python

2 Answers2

Linked