
Say I have an array of distances x=[1,2,1,3,3,2,1,5,1,1].

I want to get the indices from x where cumsum reaches 10, in this case, idx=[4,9].

So the cumsum restarts after the condition is met.

I can do it with a loop, but loops are slow for large arrays and I was wondering if I could do it in a vectorized way.

Rahul Agarwal
user3194861
  • Hard to achieve by the vectorized method – BENY Jul 05 '19 at 14:02
  • hmmm looks like it @WeNYoBen – yatu Jul 05 '19 at 14:02
  • 1
    @yatu I think I answer those type question before :-) ...Better solution maybe using numba .. – BENY Jul 05 '19 at 14:04
  • 3
    If you want to reset `cumsum` whenever 10 is reached, something like this could work: `x.cumsum()%10`. If `x` is an array this is really fast. But also this may not work for you, because how you want to handle edge cases is unclear. Like, if `cumsum` is 11, should it reset to 0 or 1? – Brenlla Jul 05 '19 at 14:29
  • No @brenlla that doesn't quite work as it doesn't take into account the remainder of the previous value that may have surpassed 10, try for instance with `[1,2,1,3,6,2,1,5,1,1]` – yatu Jul 05 '19 at 14:32
  • It requires a loop regardless. But for speed you want to loop in fast compiled code. For inherently sequential tasks like this the stock tools of numpy are more limited. – hpaulj Jul 05 '19 at 14:51
  • It's not clear to me what is actually being asked, i.e. why does @Brenlla's suggestion not provide a route to a good, fast answer? – T_M Jun 18 '20 at 09:09
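To make the edge-case discussion in the comments concrete, here is a small sketch (using yatu's counterexample) showing where `x.cumsum() % 10` and a true reset-at-threshold disagree: once a step overshoots the target, modulo carries the overshoot forward, while the desired behavior resets the running total to 0.

```python
import numpy as np

# yatu's counterexample: the running sum overshoots 10 at index 4.
x = np.array([1, 2, 1, 3, 6, 2, 1, 5, 1, 1])

# Modulo trick: carries the overshoot remainder, never marks index 4.
print(x.cumsum() % 10)  # -> [1 3 4 7 3 5 6 1 2 3]

# True reset-at-threshold: restart from 0 after each breach.
total, idx = 0, []
for i, v in enumerate(x):
    total += v
    if total >= 10:
        idx.append(i)
        total = 0
print(idx)  # -> [4, 9]
```

The modulo result never hits 0 (or 10) at index 4, so the breach there is invisible to it, which is exactly the objection raised above.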

3 Answers


A fun method

sumlm = np.frompyfunc(lambda a, b: a + b if a < 10 else b, 2, 1)
newx = sumlm.accumulate(x, dtype=object)  # note: the np.object alias is removed in NumPy 1.24+
newx
array([1, 3, 4, 7, 10, 2, 3, 8, 9, 10], dtype=object)
np.nonzero(newx == 10)

(array([4, 9]),)
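One caveat worth noting (my note, not part of the original answer): `newx == 10` only matches when the running sum lands exactly on the target. If a step can overshoot, as in yatu's counterexample from the comments, the accumulated value passes 10 without ever equalling it, so comparing with `>=` is the safer way to find the breach positions:

```python
import numpy as np

sumlm = np.frompyfunc(lambda a, b: a + b if a < 10 else b, 2, 1)

# Input whose running sum jumps from 7 to 13, skipping 10 entirely.
x = np.array([1, 2, 1, 3, 6, 2, 1, 5, 1, 1])
newx = sumlm.accumulate(x, dtype=object)  # [1, 3, 4, 7, 13, 2, 3, 8, 9, 10]

print(np.nonzero(newx >= 10)[0])  # -> [4 9]
```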
BENY

Loops are not always bad (especially when you need one). Also, there is no tool or algorithm that will make this quicker than O(n), so let's just write a good loop.

Generator Function

def cumsum_breach(x, target):
    total = 0
    for i, y in enumerate(x):
        total += y
        if total >= target:
            yield i
            total = 0

list(cumsum_breach(x, 10))

[4, 9]
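A side benefit of the generator form (my observation, not part of the original answer): it is lazy, so it pairs naturally with `itertools` when only the first few breach indices are needed and the rest of the array can be skipped:

```python
from itertools import islice

def cumsum_breach(x, target):
    total = 0
    for i, y in enumerate(x):
        total += y
        if total >= target:
            yield i
            total = 0

x = [1, 2, 1, 3, 3, 2, 1, 5, 1, 1]

# Pull only the first breach index; no work is done past index 4.
print(next(cumsum_breach(x, 10)))          # -> 4
print(list(islice(cumsum_breach(x, 10), 1)))  # -> [4]
```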

Just In Time compiling with Numba

Numba is a third party library that needs to be installed.
Numba can be persnickety about what features are supported. But this works.
Also, as pointed out by Divakar, Numba performs better with arrays.

from numba import njit

@njit
def cumsum_breach_numba(x, target):
    total = 0
    result = []
    for i, y in enumerate(x):
        total += y
        if total >= target:
            result.append(i)
            total = 0

    return result

cumsum_breach_numba(x, 10)

Testing the Two

Because I felt like it ¯\_(ツ)_/¯

Setup

np.random.seed([3, 1415])
x0 = np.random.randint(100, size=1_000_000)
x1 = x0.tolist()

Accuracy

i0 = cumsum_breach_numba(x0, 200_000)
i1 = list(cumsum_breach(x1, 200_000))

assert i0 == i1

Time

%timeit cumsum_breach_numba(x0, 200_000)
%timeit list(cumsum_breach(x1, 200_000))

582 µs ± 40.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
64.3 ms ± 5.66 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Numba was on the order of 100 times faster.

For a truer apples-to-apples test, I convert the list to a NumPy array inside the timed call

%timeit cumsum_breach_numba(np.array(x1), 200_000)
%timeit list(cumsum_breach(x1, 200_000))

43.1 ms ± 202 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
62.8 ms ± 327 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Which brings them to about even.

piRSquared
  • This should be fast enough. In the case of non-negative array, we can make it faster with break when `total > 10`. – Quang Hoang Jul 05 '19 at 14:11
  • May speed up by numba :-) BTW happy 7-4 – BENY Jul 05 '19 at 14:11
  • @piRSquared that is great sir :-) – BENY Jul 05 '19 at 14:23
  • Hi sir, could you please test the speed? I would like to know how bad frompyfunc is on timing compared with numba. – BENY Jul 05 '19 at 14:27
  • @piRSquared Convert to array and then feed it to numba. See the magic! – Divakar Jul 05 '19 at 14:58
  • @Divakar ahh yes. Thank you. Interestingly, The generator version does better with a `list` and Numba better with the `array` – piRSquared Jul 05 '19 at 15:01
  • @piRSquared The cost of converting to array is not huge. So, it's not bad when summed up. – Divakar Jul 05 '19 at 15:02
  • @Divakar pls look into this question. I tried to solve by referring above solution. But I could not... https://stackoverflow.com/questions/61346334/groupby-cumulative-in-pandas-then-update-using-numpy-based-specific-condition – Danish Apr 22 '20 at 07:09

Here's one with numba and array-initialization -

from numba import njit

@njit
def cumsum_breach_numba2(x, target, result):
    total = 0
    iterID = 0
    for i,x_i in enumerate(x):
        total += x_i
        if total >= target:
            result[iterID] = i
            iterID += 1
            total = 0
    return iterID

def cumsum_breach_array_init(x, target):
    x = np.asarray(x)
    result = np.empty(len(x),dtype=np.uint64)
    idx = cumsum_breach_numba2(x, target, result)
    return result[:idx]
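For completeness, here is the same solution as a self-contained sketch that can be run as-is on the question's sample input; the try/except fallback to a no-op decorator is my addition so the snippet still runs where numba isn't installed (at the cost of the speedup, of course):

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # numba is optional; fall back to plain Python
    njit = lambda f: f

@njit
def cumsum_breach_numba2(x, target, result):
    total = 0
    iterID = 0
    for i, x_i in enumerate(x):
        total += x_i
        if total >= target:
            result[iterID] = i  # write into the pre-allocated buffer
            iterID += 1
            total = 0
    return iterID  # number of breaches found

def cumsum_breach_array_init(x, target):
    x = np.asarray(x)
    result = np.empty(len(x), dtype=np.uint64)  # worst case: every element breaches
    idx = cumsum_breach_numba2(x, target, result)
    return result[:idx]  # trim to the part actually filled

print(cumsum_breach_array_init([1, 2, 1, 3, 3, 2, 1, 5, 1, 1], 10))  # -> [4 9]
```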

Timings

Including @piRSquared's solutions and using the benchmarking setup from the same post -

In [58]: np.random.seed([3, 1415])
    ...: x = np.random.randint(100, size=1000000).tolist()

# @piRSquared soln1
In [59]: %timeit list(cumsum_breach(x, 10))
10 loops, best of 3: 73.2 ms per loop

# @piRSquared soln2
In [60]: %timeit cumsum_breach_numba(np.asarray(x), 10)
10 loops, best of 3: 69.2 ms per loop

# From this post
In [61]: %timeit cumsum_breach_array_init(x, 10)
10 loops, best of 3: 39.1 ms per loop

Numba : Appending vs. array-initialization

For a closer look at how the array-initialization helps, which seems to be the big difference between the two numba implementations, let's time them on array data alone, since creating the array was itself heavy on runtime and both solutions depend on it -

In [62]: x = np.array(x)

In [63]: %timeit cumsum_breach_numba(x, 10)# with appending
10 loops, best of 3: 31.5 ms per loop

In [64]: %timeit cumsum_breach_array_init(x, 10)
1000 loops, best of 3: 1.8 ms per loop

To force the output to have its own memory space, we can make a copy. It won't change things in a big way, though -

In [65]: %timeit cumsum_breach_array_init(x, 10).copy()
100 loops, best of 3: 2.67 ms per loop
Divakar
  • It probably won't make much of a difference but I think to be entirely fair you should return a copy of `result[:idx]` to avoid leaking the memory of `result[idx:]`. – Paul Panzer Jul 05 '19 at 15:36
  • Great answer! Not surprised but just saying – piRSquared Jul 06 '19 at 21:42
  • @piRSquared Surely motivated by yours. I think such an array-initialization-based approach could be used for numba solutions that would otherwise need appending when the output size isn't known beforehand. Also, array data works better with numba. Those are the two learnings from this Q&A. – Divakar Jul 06 '19 at 21:45
  • Yeah great technique. I’ll be using it – piRSquared Jul 06 '19 at 21:46