
I need to quickly process a huge two-dimensional array and have already pre-marked the required data.

array([[  0.,   1.,   2.,   3.,   4.,   5. ,   6. ,   7.],
       [  6.,   7.,   8.,   9.,  10.,   4.2,   4.3,  11.],
       [ 12.,  13.,  14.,  15.,  16.,   4.2,   4.3,  17.],
       [ 18.,  19.,  20.,  21.,  22.,   4.2,   4.3,  23.]])

array([[False, True,  True,  True, False, True, True , False],
       [False, False, False, True,  True, True, True , False],
       [False, False, True, True, False, False, False, False],
       [False, True, True, False, False, False, True , True ]])

I want to cumulatively sum the marked data within each row of the array, restarting at every unmarked element. But np.cumsum alone can't do this. I need solutions or good ideas, thanks.

Expected output:

array([[  0. ,   1. ,   3. ,   6. ,   0. ,   5. ,  11. ,   0. ],
       [  0. ,   0. ,   0. ,   9. ,  19. ,  23.2,  27.5,   0. ],
       [  0. ,   0. ,  14. ,  29. ,   0. ,   0. ,   0. ,   0. ],
       [  0. ,  19. ,  39. ,   0. ,   0. ,   0. ,   4.3,  27.3]])

The difficulty is that each masked run must start from zero; it must not carry over the cumulative sum of the previous run.
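For reference, here is the required behaviour written out as a plain (slow) Python loop, just to make the specification precise. This is my own sketch, and the function name is mine:

```python
import numpy as np

def masked_cumsum_loop(a, m):
    """Row-wise cumulative sum over the True cells; restarts at every False."""
    out = np.zeros_like(a, dtype=float)
    for i in range(a.shape[0]):
        running = 0.0
        for j in range(a.shape[1]):
            if m[i, j]:
                running += a[i, j]
                out[i, j] = running
            else:
                running = 0.0  # a new run must not carry the previous sum
    return out

row = np.array([[0., 1., 2., 3., 4., 5., 6., 7.]])
sel = np.array([[False, True, True, True, False, True, True, False]])
print(masked_cumsum_loop(row, sel))  # [[ 0.  1.  3.  6.  0.  5. 11.  0.]]
```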

def mask_to_size(self, axis=-1):
    # Pad the mask with False on both sides of the cumsum axis so that every
    # run of True values produces both a rising and a falling edge.
    if self.ndim == 2:
        if axis == 0:
            mask = np.zeros((self.shape[0] + 2, self.shape[1]), dtype=bool)
            mask[1:-1] = self
            mask = mask.ravel('F')
        else:
            mask = np.zeros((self.shape[0], self.shape[1] + 2), dtype=bool)
            mask[:, 1:-1] = self
            mask = mask.ravel('C')
    else:
        mask = np.zeros(self.shape[0] + 2, dtype=bool)
        mask[1:-1] = self
    return np.diff(np.nonzero(mask[1:] != mask[:-1])[0])[::2].astype(int)

# https://stackoverflow.com/a/49179628/  by @Divakar
def intervaled_cumsum(ar, sizes):
    out = ar.copy() 
    arc = ar.cumsum() ; idx = sizes.cumsum()
    out[idx[0]] = ar[idx[0]] - arc[idx[0]-1]
    out[idx[1:-1]] = ar[idx[1:-1]] - np.diff(arc[idx[:-1]-1])
    return out.cumsum()  

def cumsum_masked(self, mask, axis=-1):
    sizes = mask_to_size(mask, axis)
    out = np.zeros(self.size)
    shape = self.shape
    if len(shape) == 2:
        if axis == 0:
            mask = mask.ravel('F')
            self = self.ravel('F')
        else:
            mask = mask.ravel('C')
            self = self.ravel('C')
    out[mask] = intervaled_cumsum(self[mask], sizes)
    if len(shape) == 2:
        if axis == 0:
            return out.reshape(shape[1], shape[0]).T
        else:
            return out.reshape(shape)
    return out

cumsum_masked(a,m,axis=1)
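As a quick 1D sanity check of intervaled_cumsum (my own example; the function is repeated here so the snippet runs standalone):

```python
import numpy as np

# Same intervaled_cumsum as above, repeated so this snippet is self-contained.
def intervaled_cumsum(ar, sizes):
    out = ar.copy()
    arc = ar.cumsum()
    idx = sizes.cumsum()
    out[idx[0]] = ar[idx[0]] - arc[idx[0] - 1]
    out[idx[1:-1]] = ar[idx[1:-1]] - np.diff(arc[idx[:-1] - 1])
    return out.cumsum()

# Two intervals of sizes 2 and 4; the running sum restarts at index 2.
ar = np.array([1., 2., 3., 4., 5., 6.])
sizes = np.array([2, 4])
print(intervaled_cumsum(ar, sizes))  # [ 1.  3.  3.  7. 12. 18.]
```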

I collected the answers and tried to optimize for speed, but it didn't work out. I think other people may need this.

weidong
  • The basic idea that I've seen in previous questions is to construct a staircase array based on the `cumsum` at the `False` elements. Subtract that from the regular `cumsum` to produce the desired sawtooth sum. So your first row is `x.cumsum() - np.array([0,0,0,0,10,10,10,28])`. – hpaulj May 09 '18 at 17:34
  • When you mention huge in `huge two-dimensional array `, typically how many rows are you expecting and what's the length of each row, i.e number of columns? – Divakar May 09 '18 at 22:15
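To illustrate the staircase idea from the comment above on the first row (my own sketch; the staircase array is written out by hand):

```python
import numpy as np

x = np.array([0., 1., 2., 3., 4., 5., 6., 7.])
# cumsum:         [ 0.  1.  3.  6. 10. 15. 21. 28.]
# staircase: the cumsum value at the most recent False position,
# carried forward:[ 0.  0.  0.  0. 10. 10. 10. 28.]
stair = np.array([0., 0., 0., 0., 10., 10., 10., 28.])
res = x.cumsum() - stair
print(res)  # [ 0.  1.  3.  6.  0.  5. 11.  0.]
```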

3 Answers


Here's an attempt to implement @hpaulj's suggestion:

>>> a = np.array([[ 0. ,  1. ,  2. ,  3. ,  4. ,  5. ,  6. ,  7. ],
...               [ 6. ,  7. ,  8. ,  9. , 10. ,  4.2,  4.3, 11. ],
...               [12. , 13. , 14. , 15. , 16. ,  4.2,  4.3, 17. ],
...               [18. , 19. , 20. , 21. , 22. ,  4.2,  4.3, 23. ]])

>>> m = np.array([[False,  True,  True,  True, False,  True,  True, False],
...               [False, False, False,  True,  True,  True,  True, False],
...               [False, False,  True,  True, False, False, False, False],
...               [False,  True,  True, False, False, False,  True,  True]])

>>> np.maximum.accumulate(np.cumsum(a, axis=1)*~m, axis=1)
array([[  0. ,   0. ,   0. ,   0. ,  10. ,  10. ,  10. ,  28. ],
       [  6. ,  13. ,  21. ,  21. ,  21. ,  21. ,  21. ,  59.5],
       [ 12. ,  25. ,  25. ,  25. ,  70. ,  74.2,  78.5,  95.5],
       [ 18. ,  18. ,  18. ,  78. , 100. , 104.2, 104.2, 104.2]])

>>> np.cumsum(a, axis=1) - np.maximum.accumulate(np.cumsum(a, axis=1)*~m, axis=1)
array([[ 0. ,  1. ,  3. ,  6. ,  0. ,  5. , 11. ,  0. ],
       [ 0. ,  0. ,  0. ,  9. , 19. , 23.2, 27.5,  0. ],
       [ 0. ,  0. , 14. , 29. ,  0. ,  0. ,  0. ,  0. ],
       [ 0. , 19. , 39. ,  0. ,  0. ,  0. ,  4.3, 27.3]])

See also Most efficient way to forward-fill NaN values in numpy array, which seems somewhat related. In particular, if your array is not >= 0 like in this toy example, the accepted answer there should be helpful.
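To see why the non-negativity matters (my own check, not part of the original post): with negative entries the zeros written at the masked (True) cells dominate the running maximum, so the staircase never steps down and the runs never reset:

```python
import numpy as np

a = np.array([[1., 2., 3., 4.]])
m = np.array([[True, True, False, True]])

c = np.cumsum(-a, axis=1)                      # all partial sums are <= 0
stair = np.maximum.accumulate(c * ~m, axis=1)  # stuck at 0 everywhere
bad = c - stair
# bad[0, 2] should be 0 (a False position) but stays at -6:
# the staircase never steps down, so the runs never reset
```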

EDIT: For future reference, here's a version that removes the >= 0 assumption above. It should still be pretty fast, though I didn't benchmark it against the other methods.

In [38]: def masked_cumsum(a, m):                                                               
    ...:     idx = np.maximum.accumulate(np.where(m, 0, np.arange(m.size).reshape(m.shape)), axis=1)
    ...:     c = np.cumsum(a, axis=-1)                                                    
    ...:     return c - c[np.unravel_index(idx, m.shape)]
    ...: 

In [43]: masked_cumsum(-a, m)
Out[43]: 
array([[  0. ,  -1. ,  -3. ,  -6. ,   0. ,  -5. , -11. ,   0. ],
       [  0. ,   0. ,   0. ,  -9. , -19. , -23.2, -27.5,   0. ],
       [  0. ,   0. , -14. , -29. ,   0. ,   0. ,   0. ,   0. ],
       [  0. , -19. , -39. ,   0. ,   0. ,   0. ,  -4.3, -27.3]])
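The index trick in masked_cumsum forward-fills, along each row, the flat index of the most recent False cell; subtracting the cumsum at those indices resets each run. A small illustration of my own:

```python
import numpy as np

m = np.array([[False, True, True, False, True]])
flat = np.where(m, 0, np.arange(m.size).reshape(m.shape))
print(flat)  # [[0 0 0 3 0]]  -- flat index kept only at False cells
idx = np.maximum.accumulate(flat, axis=1)
print(idx)   # [[0 0 0 3 3]]  -- each cell points at the last False before it
```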
filippo
  • I like this answer, it is very simple, but there are negative values in the actual data. – weidong May 10 '18 at 07:17
  • @weidong just for completeness, I updated the answer to work with negative data, not so simple as before but still compact and fast enough – filippo May 10 '18 at 08:16

There's intervaled_cumsum for 1D arrays. For this case, we simply need to get the masked elements, set up their island lengths, and feed those to that function.

Hence, one vectorized approach would be -

# https://stackoverflow.com/a/49179628/  by @Divakar
def intervaled_cumsum(ar, sizes):
    # Make a copy to be used as output array
    out = ar.copy()

    # Get cumulative values of the array
    arc = ar.cumsum()

    # Get cumulative sums of the sizes, used as indices to place
    # differentiated values into the input array's copy
    idx = sizes.cumsum()

    # Place differentiated values that, when cumulatively summed later on,
    # give us the desired intervaled cumsum
    out[idx[0]] = ar[idx[0]] - arc[idx[0]-1]
    out[idx[1:-1]] = ar[idx[1:-1]] - np.diff(arc[idx[:-1]-1])
    return out.cumsum()  

def intervaled_cumsum_masked_rowwise(a, mask):
    z = np.zeros((mask.shape[0],1), dtype=bool)
    maskz = np.hstack((z,mask,z))

    out = np.zeros_like(a)
    sizes = np.diff(np.flatnonzero(maskz[:,1:] != maskz[:,:-1]))[::2]
    out[mask] = intervaled_cumsum(a[mask], sizes)
    return out  
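As a sanity check on the island lengths (my own, not part of the original answer): padding each row with False on both sides turns every run into a pair of transitions in the flattened view, and every second difference between transition positions is a run length.

```python
import numpy as np

mask = np.array([[False,  True,  True,  True, False,  True,  True, False],
                 [False, False, False,  True,  True,  True,  True, False],
                 [False, False,  True,  True, False, False, False, False],
                 [False,  True,  True, False, False, False,  True,  True]])
z = np.zeros((mask.shape[0], 1), dtype=bool)
maskz = np.hstack((z, mask, z))
sizes = np.diff(np.flatnonzero(maskz[:, 1:] != maskz[:, :-1]))[::2]
print(sizes)  # [3 2 4 2 2 2]
```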

Sample run -

In [95]: a
Out[95]: 
array([[ 0. ,  1. ,  2. ,  3. ,  4. ,  5. ,  6. ,  7. ],
       [ 6. ,  7. ,  8. ,  9. , 10. ,  4.2,  4.3, 11. ],
       [12. , 13. , 14. , 15. , 16. ,  4.2,  4.3, 17. ],
       [18. , 19. , 20. , 21. , 22. ,  4.2,  4.3, 23. ]])

In [96]: mask
Out[96]: 
array([[False,  True,  True,  True, False,  True,  True, False],
       [False, False, False,  True,  True,  True,  True, False],
       [False, False,  True,  True, False, False, False, False],
       [False,  True,  True, False, False, False,  True,  True]])

In [97]: intervaled_cumsum_masked_rowwise(a, mask)
Out[97]: 
array([[ 0. ,  1. ,  3. ,  6. ,  0. ,  5. , 11. ,  0. ],
       [ 0. ,  0. ,  0. ,  9. , 19. , 23.2, 27.5,  0. ],
       [ 0. ,  0. , 14. , 29. ,  0. ,  0. ,  0. ,  0. ],
       [ 0. , 19. , 39. ,  0. ,  0. ,  0. ,  4.3, 27.3]])

Works just as well with negative numbers -

In [109]: a = -a

In [110]: a
Out[110]: 
array([[ -0. ,  -1. ,  -2. ,  -3. ,  -4. ,  -5. ,  -6. ,  -7. ],
       [ -6. ,  -7. ,  -8. ,  -9. , -10. ,  -4.2,  -4.3, -11. ],
       [-12. , -13. , -14. , -15. , -16. ,  -4.2,  -4.3, -17. ],
       [-18. , -19. , -20. , -21. , -22. ,  -4.2,  -4.3, -23. ]])

In [111]: intervaled_cumsum_masked_rowwise(a, mask)
Out[111]: 
array([[  0. ,  -1. ,  -3. ,  -6. ,   0. ,  -5. , -11. ,   0. ],
       [  0. ,   0. ,   0. ,  -9. , -19. , -23.2, -27.5,   0. ],
       [  0. ,   0. , -14. , -29. ,   0. ,   0. ,   0. ,   0. ],
       [  0. , -19. , -39. ,   0. ,   0. ,   0. ,  -4.3, -27.3]])
Divakar
1

Here is an approach that is quite a bit slower than @Divakar's and @filippo's, but more robust. The problem with "global cumsummy" approaches is that they can suffer from loss of significance; see below:

import numpy as np
from scipy import linalg

def cumsums(data, mask, break_lines=True):
    dr = data[mask]
    if break_lines:
        msk = mask.copy()
        msk[:, 0] = False
        mr = msk.ravel()[1:][mask.ravel()[:-1]][:dr.size-1]
    else:
        mr = mask.ravel()[1:][mask.ravel()[:-1]][:dr.size-1]
    D = np.empty((2, dr.size))
    D.T[...] = 1, 0
    D[1, :-1] -= mr
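    # solve_banded((1, 0), D, dr) solves the lower-bidiagonal system
    #     out[i] - mr[i-1] * out[i-1] = dr[i],
    # so a -1 on the sub-diagonal chains an element to its predecessor
    # within a run, while a 0 breaks the chain and restarts the sum.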
    out = np.zeros_like(data)
    out[mask] = linalg.solve_banded((1, 0), D, dr)
    return out

def f_staircase(a, m):
    return np.cumsum(a, axis=1) - np.maximum.accumulate(np.cumsum(a, axis=1)*~m, axis=1)

# https://stackoverflow.com/a/49179628/  by @Divakar
def intervaled_cumsum(ar, sizes):
    # Make a copy to be used as output array
    out = ar.copy()

    # Get cumulative values of the array
    arc = ar.cumsum()

    # Get cumulative sums of the sizes, used as indices to place
    # differentiated values into the input array's copy
    idx = sizes.cumsum()

    # Place differentiated values that, when cumulatively summed later on,
    # give us the desired intervaled cumsum
    out[idx[0]] = ar[idx[0]] - arc[idx[0]-1]
    out[idx[1:-1]] = ar[idx[1:-1]] - np.diff(arc[idx[:-1]-1])
    return out.cumsum()  

def intervaled_cumsum_masked_rowwise(a, mask):
    z = np.zeros((mask.shape[0],1), dtype=bool)
    maskz = np.hstack((z,mask,z))

    out = np.zeros_like(a)
    sizes = np.diff(np.flatnonzero(maskz[:,1:] != maskz[:,:-1]))[::2]
    out[mask] = intervaled_cumsum(a[mask], sizes)
    return out  

data = np.array([[  0.,   1.,   2.,   3.,   4.,   5. ,   6. ,   7.],
                 [  6.,   7.,   8.,   9.,  10.,   4.2,   4.3,  11.],
                 [ 12.,  13.,  14.,  15.,  16.,   4.2,   4.3,  17.],
                 [ 18.,  19.,  20.,  21.,  22.,   4.2,   4.3,  23.]])

mask = np.array([[False, True,  True,  True, False, True, True , False],
                 [False, False, False, True,  True, True, True , False],
                 [False, False, True, True, False, False, False, False],
                 [False, True, True, False, False, False, True , True ]])

from timeit import timeit

print('fast?')
print('filippo', timeit(lambda: f_staircase(data, mask), number=1000))
print('pp     ', timeit(lambda: cumsums(data, mask), number=1000))
print('divakar', timeit(lambda: intervaled_cumsum_masked_rowwise(data, mask), number=1000))

data = np.random.uniform(-10, 10, (5000, 5000))
mask = np.random.random((5000, 5000)) < 0.125
mask[:, 1:] |= mask[:, :-1]
mask[:, 2:] |= mask[:, :-2]

print()
print('fast on large data?')
print('filippo', timeit(lambda: f_staircase(data, mask), number=3))
print('pp     ', timeit(lambda: cumsums(data, mask), number=3))
print('divakar', timeit(lambda: intervaled_cumsum_masked_rowwise(data, mask), number=3))

data = np.random.uniform(-10, 10, (10000, 10000))
mask = np.random.random((10000, 10000)) < 0.025
mask[:, 1:] |= mask[:, :-1]
mask[:, 2:] |= mask[:, :-2]

print()
print('fast on large sparse data?')
print('filippo', timeit(lambda: f_staircase(data, mask), number=3))
print('pp     ', timeit(lambda: cumsums(data, mask), number=3))
print('divakar', timeit(lambda: intervaled_cumsum_masked_rowwise(data, mask), number=3))

data = np.exp(-np.linspace(-24, 24, 100))[None]
mask = (np.arange(100) % 4).astype(bool)[None]

print()
print('numerically sound?')
print('correct', data[0, -3:].sum())
print('filippo', f_staircase(data, mask)[0,-1]) 
print('pp     ', cumsums(data, mask)[0,-1])
print('divakar', intervaled_cumsum_masked_rowwise(data, mask)[0,-1])

Output:

fast?
filippo 0.008435532916337252
pp      0.07329772273078561
divakar 0.0336935929954052

fast on large data?
filippo 1.6037923698313534
pp      3.982803522143513
divakar 1.706403402145952

fast on large sparse data?
filippo 6.11361704999581
pp      4.717669038102031
divakar 2.9474888620898128

numerically sound?
correct 1.9861262739950047e-10
filippo 0.0
pp      1.9861262739950047e-10
divakar 9.737630365237156e-06

We see that with the falling exponential example the cumsum based approaches don't work. Obviously, this is an engineered example, but it showcases a real problem.
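The loss of significance is ordinary floating-point cancellation: once the running total is large, later small contributions are rounded away before the final subtraction can recover them. A minimal demonstration of my own, independent of the answers above:

```python
import numpy as np

# At 1e16 the spacing between adjacent float64 values is 2.0, so adding 1.0
# leaves the running total unchanged, and subtracting cumsum values cannot
# recover the lost contribution.
c = np.cumsum(np.array([1e16, 1.0]))
diff = c[1] - c[0]
print(diff)  # 0.0, although the true interval sum is 1.0
```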

Paul Panzer
  • Thank you for your testing and for pointing this out. I'm afraid I need more speed, but I will keep and study your code so I can use it when appropriate. – weidong May 10 '18 at 07:19