Consecutive values in array with periodic boundaries in Python

Question

I have some 2D-arrays filled with 0 and 1:

import numpy as np

a = np.random.randint(2, size=(20, 20))
b = np.random.randint(2, size=(20, 20))
c = np.random.randint(2, size=(20, 20))
d = np.random.randint(2, size=(20, 20))

and I want to count the consecutive occurrence of the ones with periodic boundaries. That means (in 1D for clearness):

[1 1 0 0 1 1 0 1 1 1]

should give me 5(last three elements + first two).
The 2D-arrays should be compared/counted in the third (second if you start with 0) axis, like first stacking the arrays in axis=2 and then applying the same algorithm like for 1D. But I am not sure if this is the most simple way.

Could you elaborate on what do you mean by `periodic boundaries`? — Divakar, Nov 26 '19 at 14:42
@Divakar In the above example the longest sequence of ones would be 3, but with periodic boundaries it is 5. I mean that the end is again "attached" at the first element like a in a circle. — clearseplex, Nov 26 '19 at 14:45
@a_guest Good point! I would expect 3, but didn't consider this case above to be honest. — clearseplex, Nov 26 '19 at 14:48
@clearseplex Not sure about how you generalize from 1D to 2D. Why do you add an axis when going in 2D? — ndrplz, Nov 26 '19 at 14:48
@ndrplz I have monthly data (12) each as 2D-array (filled with 1 and 0). Let's say I have for the ```[i,j]```element in month 1 the value ```0```, second month ```1```and so on. Then I have a sequence of ones and zeros in the 3rd axis. — clearseplex, Nov 26 '19 at 14:51
Can you add an example for the data and the desired output in 2D? — LeoE, Nov 26 '19 at 14:58
Do you only want the wrap around or the longest overall? For example with [1 0 0 1 1 1 0 0 1] do you want 2 or 3? — Paul Panzer, Nov 26 '19 at 18:39

score 1 · Answer 1 · answered Nov 26 '19 at 14:54

You can use groupby from itertools:

from itertools import groupby

a = [1, 1, 0, 0, 1, 1, 0, 1, 1, 1]

def get_longest_seq(a):
    if all(a):
        return len(a)

    a_lens = [len(list(it)) for k, it in groupby(a) if k != 0]

    if a[0] == 1 and a[-1] == 1:
        m = max(max(a_lens), a_lens[0] + a_lens[-1])
    else:
        m = max(a_lens)
    return m

print(get_longest_seq(a))

Divakar · Accepted Answer · 2019-11-26T20:32:31.317

Here's one way for ndarrays a of 2D and higher dim arrays, meant for performance efficiency -

def count_periodic_boundary(a):
    a = a.reshape(-1,a.shape[-1])
    m = a==1    
    c0 = np.flip(m,axis=-1).argmin(axis=-1)+m.argmin(axis=-1)
    z = np.zeros(a.shape[:-1]+(1,),dtype=bool)
    p = np.hstack((z,m,z))
    c = (p[:,:-1]<p[:,1:]).sum(1)
    s = np.r_[0,c[:-1].cumsum()]
    l = np.diff(np.flatnonzero(np.diff(p.ravel())))[::2]
    d = np.maximum(c0,np.maximum.reduceat(l,s))    
    return np.where(m.all(-1),a.shape[-1],d)

Sample runs -

In [75]: np.random.seed(0)
    ...: a = np.random.randint(2, size=(5, 20))

In [76]: a
Out[76]: 
array([[0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
       [0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0],
       [0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1],
       [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0],
       [0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0]])

In [77]: count_periodic_boundary(a)
Out[77]: array([7, 4, 5, 2, 6])


In [72]: np.random.seed(0)
    ...: a = np.random.randint(2, size=(2, 5, 20))

In [73]: a
Out[73]: 
array([[[0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
        [0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0],
        [0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1],
        [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0],
        [0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0]],

       [[1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0],
        [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0],
        [1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1],
        [0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0],
        [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]]])

In [74]: count_periodic_boundary(a)
Out[74]: array([7, 4, 5, 2, 6, 2, 5, 4, 2, 1])

Can you explain the output? In ```[84]``` you have 5 elements in the outputs (I assume because of 5 rows) but I can't explain the values. In the first row for example the longest sequence of ones is 7, isn't it? — clearseplex, Nov 26 '19 at 18:42
@clearseplex I thought you were only concerned by the boundary elements, aren't you? So, you want the biggest island length of 1s in each row and accounting for the boundary 1s wrapped around? — Divakar, Nov 26 '19 at 18:44
Yes the biggest "island" with taking into account the boundaries, sorry if it was not clear from my post. — clearseplex, Nov 26 '19 at 18:46
@clearseplex Use `np.where((a==0).all(axis=-1),0,out) `, where `out` is the output off `count_periodic_boundary`. — Divakar, Dec 17 '19 at 07:09
I mean the functions fails for inputs with only ```0``` in the line ```d = np.maximum(c0,np.maximum.reduceat(l,s)) ``` because ```l=[]``` — clearseplex, Dec 17 '19 at 07:38

Paul Panzer · Answer 3 · 2019-11-26T19:21:47.950

Here is a two-liner, admittedly containing one rather long line:

*m,n = a.shape
return np.minimum(n,(np.arange(1,2*n+1)-np.maximum.accumulate(np.where(a[...,None,:],0,np.arange(1,2*n+1).reshape(2,n)).reshape(*m,2*n),-1)).max(-1))

How it works:

Let's first ignore the wrap around and consider a simple example: a = [1 0 0 1 1 0 1 1 1 0] We want to transform this into b = [1 0 0 1 2 0 1 2 3 0], so we can simply take the maximum. One way of generating b is taking the arange r = [1 2 3 4 5 6 7 8 9 10] and subtracting aux = [0 2 3 3 3 6 6 6 6 10]. aux we create by multiplying r with (1-a) yielding [0 2 3 0 0 6 0 0 0 10] and taking the cumulative maximum.

To deal with the wrap around we simply put two copies of a next to each other and then use the above.

Here is the code again broken down into smaller bits and commented:

*m,n = a.shape
# r has length 2*n because of how we deal with the wrap around
r = np.arange(1,2*n+1)
# create r x (1-a) using essentially np.where(a,0,r)
# it's a bit more involved because we are cloning a in the same step
# a will be doubled along a new axis we insert before the last one
# this will happen by means of broadcasting against r which we distribute
# over two rows along the new axis
# in the very end we merge the new and the last axis
r1_a = np.where(a[...,None,:],0,r.reshape(2,n)).reshape(*m,2*n)
# take cumulative max
aux = np.maximum.accumulate(r1_a,-1)
# finally, take the row wise maximum and deal with all-one rows
return np.minimum(n,(r-aux).max(-1))

Consecutive values in array with periodic boundaries in Python

3 Answers3

Linked