0

I have some 2D-arrays filled with 0 and 1:

import numpy as np

a = np.random.randint(2, size=(20, 20))
b = np.random.randint(2, size=(20, 20))
c = np.random.randint(2, size=(20, 20))
d = np.random.randint(2, size=(20, 20)) 

and I want to count the consecutive occurrence of the ones with periodic boundaries. That means (in 1D for clearness):

[1 1 0 0 1 1 0 1 1 1]

should give me 5(last three elements + first two).
The 2D-arrays should be compared/counted in the third (second if you start with 0) axis, like first stacking the arrays in axis=2 and then applying the same algorithm like for 1D. But I am not sure if this is the most simple way.

clearseplex
  • 679
  • 1
  • 7
  • 22

3 Answers3

1

You can use groupby from itertools:

from itertools import groupby

a = [1, 1, 0, 0, 1, 1, 0, 1, 1, 1]

def get_longest_seq(a):
    if all(a):
        return len(a)

    a_lens = [len(list(it)) for k, it in groupby(a) if k != 0]

    if a[0] == 1 and a[-1] == 1:
        m = max(max(a_lens), a_lens[0] + a_lens[-1])
    else:
        m = max(a_lens)
    return m

print(get_longest_seq(a))
LeoE
  • 2,054
  • 1
  • 11
  • 29
1

Here's one way for ndarrays a of 2D and higher dim arrays, meant for performance efficiency -

def count_periodic_boundary(a):
    a = a.reshape(-1,a.shape[-1])
    m = a==1    
    c0 = np.flip(m,axis=-1).argmin(axis=-1)+m.argmin(axis=-1)
    z = np.zeros(a.shape[:-1]+(1,),dtype=bool)
    p = np.hstack((z,m,z))
    c = (p[:,:-1]<p[:,1:]).sum(1)
    s = np.r_[0,c[:-1].cumsum()]
    l = np.diff(np.flatnonzero(np.diff(p.ravel())))[::2]
    d = np.maximum(c0,np.maximum.reduceat(l,s))    
    return np.where(m.all(-1),a.shape[-1],d)

Sample runs -

In [75]: np.random.seed(0)
    ...: a = np.random.randint(2, size=(5, 20))

In [76]: a
Out[76]: 
array([[0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
       [0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0],
       [0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1],
       [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0],
       [0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0]])

In [77]: count_periodic_boundary(a)
Out[77]: array([7, 4, 5, 2, 6])


In [72]: np.random.seed(0)
    ...: a = np.random.randint(2, size=(2, 5, 20))

In [73]: a
Out[73]: 
array([[[0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
        [0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0],
        [0, 1, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 0, 0, 1],
        [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0],
        [0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0]],

       [[1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0],
        [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0],
        [1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 1],
        [0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0],
        [1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]]])

In [74]: count_periodic_boundary(a)
Out[74]: array([7, 4, 5, 2, 6, 2, 5, 4, 2, 1])
Divakar
  • 218,885
  • 19
  • 262
  • 358
  • Can you explain the output? In ```[84]``` you have 5 elements in the outputs (I assume because of 5 rows) but I can't explain the values. In the first row for example the longest sequence of ones is 7, isn't it? – clearseplex Nov 26 '19 at 18:42
  • @clearseplex I thought you were only concerned by the boundary elements, aren't you? So, you want the biggest island length of 1s in each row and accounting for the boundary 1s wrapped around? – Divakar Nov 26 '19 at 18:44
  • Yes the biggest "island" with taking into account the boundaries, sorry if it was not clear from my post. – clearseplex Nov 26 '19 at 18:46
  • How to account for cases where there are only 0's? – clearseplex Dec 17 '19 at 07:04
  • @clearseplex Use `np.where((a==0).all(axis=-1),0,out) `, where `out` is the output off `count_periodic_boundary`. – Divakar Dec 17 '19 at 07:09
  • I mean the functions fails for inputs with only ```0``` in the line ```d = np.maximum(c0,np.maximum.reduceat(l,s)) ``` because ```l=[]``` – clearseplex Dec 17 '19 at 07:38
1

Here is a two-liner, admittedly containing one rather long line:

*m,n = a.shape
return np.minimum(n,(np.arange(1,2*n+1)-np.maximum.accumulate(np.where(a[...,None,:],0,np.arange(1,2*n+1).reshape(2,n)).reshape(*m,2*n),-1)).max(-1))

How it works:

Let's first ignore the wrap around and consider a simple example: a = [1 0 0 1 1 0 1 1 1 0] We want to transform this into b = [1 0 0 1 2 0 1 2 3 0], so we can simply take the maximum. One way of generating b is taking the arange r = [1 2 3 4 5 6 7 8 9 10] and subtracting aux = [0 2 3 3 3 6 6 6 6 10]. aux we create by multiplying r with (1-a) yielding [0 2 3 0 0 6 0 0 0 10] and taking the cumulative maximum.

To deal with the wrap around we simply put two copies of a next to each other and then use the above.

Here is the code again broken down into smaller bits and commented:

*m,n = a.shape
# r has length 2*n because of how we deal with the wrap around
r = np.arange(1,2*n+1)
# create r x (1-a) using essentially np.where(a,0,r)
# it's a bit more involved because we are cloning a in the same step
# a will be doubled along a new axis we insert before the last one
# this will happen by means of broadcasting against r which we distribute
# over two rows along the new axis
# in the very end we merge the new and the last axis
r1_a = np.where(a[...,None,:],0,r.reshape(2,n)).reshape(*m,2*n)
# take cumulative max
aux = np.maximum.accumulate(r1_a,-1)
# finally, take the row wise maximum and deal with all-one rows
return np.minimum(n,(r-aux).max(-1))
Paul Panzer
  • 51,835
  • 3
  • 54
  • 99