
Given an array 'array' and a set of indices 'indices', how do I find the cumulative sum of the sub-arrays formed by splitting the array along those indices in a vectorized manner? To clarify, suppose I have:

>>> array = np.arange(20)
>>> array
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
>>> indices = np.array([3, 8, 14])

The operation should output:

array([0, 1, 3, 3, 7, 12, 18, 25, 8, 17, 27, 38, 50, 63, 14, 29, 45, 62, 80, 99])
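
In other words, the cumulative sum should restart at each index. A straightforward loop version of what I mean would be the following (far too slow for my real data, hence this question):

out = np.empty_like(array)
bounds = np.concatenate(([0], indices, [len(array)]))
for i, j in zip(bounds[:-1], bounds[1:]):
    out[i:j] = np.cumsum(array[i:j])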

Please note that the array is very big (100000 elements), so I need a vectorized answer; using any loops would slow it down considerably. Also, if I had the same problem with a 2D array and corresponding per-row indices, and needed to do the same thing for each row of the array, how would I do it?

For the 2D version:

>>> array = np.arange(12).reshape((3,4))
>>> array
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> indices = np.array([[2], [1, 3], [1, 2]])

The output would be:

array([[ 0,  1,  3,  3],
       [ 4,  9,  6, 13],
       [ 8, 17, 10, 11]])

To clarify: Every row will be split.

Aditya369

2 Answers


You can differentiate the originally cumulatively summed array at the `indices` positions to create a boundary-like effect at those places, such that when the differentiated array is cumulatively summed again, it gives us the indices-stopped cumulatively summed output. This might feel a bit contrived at first look, but stick with it, try it on other samples, and hopefully it will make sense! The idea is very similar to the one applied in this other MATLAB solution.
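
To see the trick in one dimension first, here is a minimal sketch (the helper name `cumsum_with_resets` is my label, not part of the original code):

import numpy as np

def cumsum_with_resets(a, idx):
    c = a.cumsum()
    # total accumulated before each boundary position
    carry = c[idx] - a[idx]
    # at each boundary, subtract only what accumulated since the previous
    # boundary, so the running sum restarts there
    b = a.copy()
    b[idx] -= np.diff(np.append(0, carry))
    return b.cumsum()

cumsum_with_resets(np.arange(20), np.array([3, 8, 14]))
# -> [ 0  1  3  3  7 12 18 25  8 17 27 38 50 63 14 29 45 62 80 99]

Following the same philosophy on the raveled 2D array, here's one approach using numpy.diff along with cumulative summation -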

# Get linear indices
n = array.shape[1]
lidx = np.hstack([r*n + np.asarray(item) for r, item in enumerate(indices)])

# Get successive differentiations
diffs = array.cumsum(1).ravel()[lidx] - array.ravel()[lidx]

# Get previous group's offsetted summations for each row at all 
# indices positions across the entire 2D array
_, idx = np.unique(lidx // n, return_index=True)
offsetted_diffs = np.diff(np.append(0,diffs))
offsetted_diffs[idx] = diffs[idx]

# Get a copy of input array and place previous group's offsetted summations 
# at indices. Then, do cumulative sum which will create a boundary like 
# effect with those offsets at indices positions.
arrayc = array.copy()
arrayc.ravel()[lidx] -= offsetted_diffs
out = arrayc.cumsum(1)

This should be an almost vectorized solution; "almost" because even though we are calculating the linear indices in a loop, that is not the computationally intensive part here, so its effect on the total runtime would be minimal. Also, you can replace arrayc with array if you don't mind destroying the input, to save on memory.

Sample input, output -

In [75]: array
Out[75]: 
array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20, 21, 22, 23]])

In [76]: indices
Out[76]: array([[3, 6], [4, 7], [5]], dtype=object)

In [77]: out
Out[77]: 
array([[ 0,  1,  3,  3,  7, 12,  6, 13],
       [ 8, 17, 27, 38, 12, 25, 39, 15],
       [16, 33, 51, 70, 90, 21, 43, 66]])
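
For convenience, the steps above can be packed into one function. This is just a sketch (the name `grouped_cumsum_2d` is my label, and it assumes every row has at least one split index):

import numpy as np

def grouped_cumsum_2d(array, indices):
    n = array.shape[1]
    # linear (raveled) positions of all per-row split indices
    lidx = np.hstack([r*n + np.asarray(item) for r, item in enumerate(indices)])
    diffs = array.cumsum(1).ravel()[lidx] - array.ravel()[lidx]
    # first boundary of each row keeps the full offset, later ones the increment
    _, idx = np.unique(lidx // n, return_index=True)
    offsetted_diffs = np.diff(np.append(0, diffs))
    offsetted_diffs[idx] = diffs[idx]
    arrayc = array.copy()
    arrayc.ravel()[lidx] -= offsetted_diffs
    return arrayc.cumsum(1)

grouped_cumsum_2d(np.arange(24).reshape(3, 8), [[3, 6], [4, 7], [5]])
# -> the `out` array shown above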
Divakar
  • Would this work for 2D arrays? Would the indexing remain the same? – Aditya369 Dec 30 '15 at 09:31
  • @Aditya369 Add a 2D sample case and expected output in the question? – Divakar Dec 30 '15 at 09:31
  • I did add that. Did you want me to add another one? – Aditya369 Dec 30 '15 at 09:32
  • @Aditya369 What's the shape of 2D array and number of elements in `indices` in your actual 2D case? – Divakar Dec 30 '15 at 10:42
  • 300*100000 is the shape of array and number of indices is between 3000 to 20000 for each row, so that's 900000 to 6000000 indices in total. – Aditya369 Dec 30 '15 at 10:44
  • @Aditya369 So, since the number of rows is just `300`, you can just loop through rows with the listed approach. – Divakar Dec 30 '15 at 10:45
  • Yeah. That's what I'm doing for now. I'll see if I can get a better answer for 2D version and if not, accept this. Your solution will be faster than the other solution too. – Aditya369 Dec 30 '15 at 10:50
  • Can we ravel this, perform the operation on modified indices, then unravel it again? – Aditya369 Dec 30 '15 at 10:53
  • Are you taking a copy of the array because you want to preserve the initial array? – Aditya369 Dec 30 '15 at 11:07
  • @Aditya369 Check out the edits to solve for a 2D case! – Divakar Dec 30 '15 at 19:00
  • Can you please try and solve the follow-up question? The link is http://stackoverflow.com/questions/34552304/python-given-numpy-array-of-weights-find-indices-which-split-array-so-that-sum – Aditya369 Dec 31 '15 at 23:09

You can use np.split to split your array along the indices, then use the Python built-in function map to apply np.cumsum() to the sub-arrays. Finally, np.hstack merges the results back into one integrated array (map is wrapped in list(...) below so that this also works on Python 3, where map returns an iterator and np.hstack expects a sequence):

>>> np.hstack(list(map(np.cumsum,np.split(array,indices))))
array([ 0,  1,  3,  3,  7, 12, 18, 25,  8, 17, 27, 38, 50, 63, 14, 29, 45,
       62, 80, 99])

Note that since map is a built-in function and is implemented in C inside the Python interpreter, it performs better than a regular loop.

Here is an alternative for 2D arrays:

>>> def func(array,indices):
...     return np.hstack(list(map(np.cumsum,np.split(array,indices))))
... 
>>> 
>>> array
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> 
>>> indices
array([[2], [1, 3], [1, 2]], dtype=object)

>>> np.array([func(arr,ind) for arr,ind in zip(array,indices)])
array([[ 0,  1,  2,  5],
       [ 4,  5, 11,  7],
       [ 8,  9, 10, 21]])

Note that your expected output does not match the way np.split works.

If you want such results, you need to add 1 to your indices:

>>> indices = np.array([[3], [2, 4], [2, 3]], dtype=object)
>>> 
>>> np.array([func(arr,ind) for arr,ind in zip(array,indices)])
array([[ 0,  1,  3,  3],
       [ 4,  9,  6, 13],
       [ 8, 17, 10, 11]])
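
Instead of retyping the indices, the original ones from the question can be shifted programmatically. A small sketch, reusing `func` from above:

>>> orig = np.array([[2], [1, 3], [1, 2]], dtype=object)  # the question's indices
>>> shifted = [np.asarray(ind) + 1 for ind in orig]
>>> np.array([func(arr, ind) for arr, ind in zip(array, shifted)])
array([[ 0,  1,  3,  3],
       [ 4,  9,  6, 13],
       [ 8, 17, 10, 11]])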

Following a comment claiming that there is no performance difference between using a generator expression and the map function, I ran a benchmark which demonstrates the difference better.

# Use map
~$ python -m timeit --setup "import numpy as np;array = np.arange(20);indices = np.array([3, 8, 14])" "np.hstack(map(np.cumsum,np.split(array,indices)))"
10000 loops, best of 3: 72.1 usec per loop
# Use generator expression
~$ python -m timeit --setup "import numpy as np;array = np.arange(20);indices = np.array([3, 8, 14])" "np.hstack(np.cumsum(a) for a in np.split(array,indices))"
10000 loops, best of 3: 81.2 usec per loop
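
For sizes closer to the question's 100000-element case, a similar comparison can be set up as below (timings omitted here, since they vary by machine and by Python/NumPy version):

~$ python -m timeit --setup "import numpy as np;array = np.arange(1000000);indices = np.sort(np.random.choice(1000000, 100, replace=False))" "np.hstack(list(map(np.cumsum,np.split(array,indices))))"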

Note that this doesn't mean that using map, which itself runs at C speed, makes this code run at C speed. That's because the code is implemented in Python, and calling the function (the first argument) and applying it to each item of the iterable still takes time.

Mazdak
  • This is brilliant!!!!! This makes things so much better for me. Is there any way to do this for a 2D matrix? – Aditya369 Dec 30 '15 at 07:58
  • @Aditya369 How do you want to split a 2D matrix with the `indices` array? Do you want to split all of the rows, or something else? – Mazdak Dec 30 '15 at 08:00
  • I have an array of arrays like so: indices = np.array([[3, 8, 14], [4, 9, 17]]) corresponding to an array with two rows. So [3, 8, 14] is for the first row, and [4, 9, 17] is for the second row. – Aditya369 Dec 30 '15 at 08:01
  • @Aditya369 Please add it to your question with a sample input. – Mazdak Dec 30 '15 at 08:07
  • To answer you, all rows will be split. – Aditya369 Dec 30 '15 at 08:28
  • It is very misleading to claim that `map` *"performs in C speed"*. There is essentially no difference in performance between `np.hstack(map(np.cumsum,np.split(array,indices)))` and a regular `for` loop or generator expression, e.g. `np.hstack(np.cumsum(a) for a in np.split(array, indices))`. – ali_m Dec 30 '15 at 17:57
  • @ali_m *very misleading*?? Python built-in functions are [implemented in C inside the Python interpreter](https://hg.python.org/cpython/file/f2e6c33ce3e9/Python/bltinmodule.c#l929). Besides, since you weren't aware of this point: as for the benchmark, you should try larger datasets to see the difference, and about the generator expression you are wrong, because each generator expression is converted to a generator function by Python, and calling its `__iter__` and `__next__` attributes makes the code take more time than even a regular loop. – Mazdak Dec 30 '15 at 19:36
  • Even for a 1000000-long array with 100 indices I see a difference of about 13 ms in runtime between using `map` and a generator expression. That amounts to about a 14% speed-up - hardly the difference between "Python speed" and "C speed". By your logic, any piece of Python code that uses exclusively CPython builtins should run "at C speed", which is obviously ridiculous. – ali_m Dec 30 '15 at 19:53
  • @ali_m Indeed, I think you misinterpreted my answer (I wrote it badly!). I said `map` performs at C speed, but not my code, which is implemented in Python and needs to apply the numpy function to each iterable element. Anyway, I'll update the answer to prevent such misinterpretation in future. – Mazdak Dec 30 '15 at 20:04