Cumulative maximum of any iterable with a list comprehension

Question

Another question asked to compute cumulative maximum of a list in a list comprehension, e.g.:

input:  [3, 4, 2, 8, 9, 3, 3, 4, 20, 1]
output: [3, 4, 4, 8, 9, 9, 9, 9, 20, 20]

I came up with a decent list comprehension there as requested, but it uses that the input is given as a list:

[m
 for m in list_input[:1]
 for x in list_input
 for m in [max(m, x)]]

My question/challenge: What if the input is an iterator or any other iterable? Should still be just a list comprehension and take only linear time. And of course not using accumulate(iterable, max). I found one and I like it, but would like to see other people's solutions as well :-)

Testing code, leaving cumulative_maximum to implement (Try it online!):

from itertools import accumulate

def cumulative_maximum(iterable):
    return [your list comprehension here]

def test(iterable):
    iterable = tuple(iterable)
    expect = list(accumulate(iterable, max))
    output = cumulative_maximum(iterable)
    print(output == expect, output)
    output = cumulative_maximum(iter(iterable))
    print(output == expect, output)

test([3, 4, 2, 8, 9, 3, 3, 4, 20, 1])
test([])
test([3])
test([3, 1])

if we can use `tee`, then I think your original list comp works with minimal adjustment — , Apr 11 '22 at 20:15
@enke Yes, that would be alright. I really just rule out `accumulate`, as that pretty much does the whole job and isn't interesting at all. — Kelly Bundy, Apr 11 '22 at 20:20
`reduce(lambda x, r: x.append(max(r, x[-1])) or x, lst[1:], lst[:1])` :) No list comprehension, though — njzk2, Apr 11 '22 at 21:11
@njzk2 More importantly, it misses the point of the question. Even a *list* iterator can't be sliced like that. — Kelly Bundy, Apr 11 '22 at 21:15
well, it's just a way of writing _first_ and _rest_, avoiding ifs and elses, until python supports pattern matching for iterators as well. This does the same and works for iterables, though: `reduce(lambda x, r: x.append(x[-1] if x and x[-1] > r else r) or x, lst, [])` — njzk2, Apr 11 '22 at 21:47
I was messing around and it seems `first, second = tee(iterable); return [m for m in [next(first, [])] for x in second for m in [m if m>x else x]]` is 30% faster than `list(accumulate(iterable, max))`. I thought `accumulate` was optimized for this job. How is it slower? Also the equivalent numpy method seems to perform better than `accumulate` even on lists too. — , Apr 12 '22 at 00:37
@enke `accumulate` probably *is* faster if your list comp uses `max(m, x)`, no? — Kelly Bundy, Apr 12 '22 at 01:21
@enke `tee` slows it down a bit, but not by much. See benchmarks in my answer. — Kelly Bundy, Apr 13 '22 at 16:42

Kraigolas · Answer 1 · 2022-04-11T19:25:04.153

2

If we can use the walrus operator, this works:

return [m := x if idx == 0 else max(m, x) for idx, x in enumerate(iterable)]

enumerate, if, and else are strictly for initializing m.

This is a Python 3.8+ solution.

edited Apr 11 '22 at 19:25

answered Apr 11 '22 at 19:19

Kraigolas

5,121
3
12
37

1

Nice. Leaks the `m` variable to outside of the list comprehension, but that outside is just the function and it returns right away, so no issue. – Kelly Bundy Apr 11 '22 at 19:37

Kelly Bundy · Answer 2 · 2022-04-13T16:50:11.903

Here's mine and benchmarks:

def cumulative_maximum(iterable):
    return [m
            for it in [iter(iterable)]
            for m in it
            for xs in [[m], it]
            for x in xs
            for m in [max(m, x)]]

First I make sure I have an iterator, calling it it. Then I initialize my maximum variable m with the first element (if there isn't one, we just stop right there). Then the iterator it only has the remaining elements, so I "chain" m back in front of it by going through xs = [m] and afterwards going through xs = it. And for each I go through the x-values and update/output m.

Benchmark on iterable = iter([random.random() for _ in range(1000)]):

Python 3.10.4:
 35.9 us   35.9 us   35.9 us  Kelly2
 42.3 us   42.4 us   42.5 us  enke
 43.3 us   43.4 us   43.5 us  loop2
 55.8 us   55.8 us   56.5 us  loop
 63.6 us   63.6 us   63.7 us  Kraigolas2
 74.8 us   74.9 us   75.2 us  accumulate_lambda
106.1 us  106.4 us  106.5 us  accumulate_max
120.6 us  120.7 us  120.7 us  Kelly
161.8 us  162.4 us  162.4 us  Kraigolas

Python 3.9.2:
 39.5 us   39.6 us   39.6 us  Kelly2
 45.9 us   46.1 us   46.2 us  enke
 47.4 us   47.6 us   47.6 us  loop2
 55.5 us   55.6 us   55.6 us  loop
 65.6 us   65.7 us   65.7 us  Kraigolas2
 74.4 us   74.6 us   74.6 us  accumulate_lambda
 93.0 us   93.1 us   93.2 us  accumulate_max
108.7 us  108.9 us  109.0 us  Kelly
143.8 us  143.8 us  143.8 us  Kraigolas

Note:

My above solution is my slower one, the fast Kelly2 version replaced max(m, x) with the faster x if m < x else m.
@enke's suggestion with tee is a little slower than mine because it iterates through a tee iterator whereas mine iterates directly on the (iterator of the) given iterable. Also, it has the downside of additionally buffering all but the first element in memory in another structure, because the first tee iterator stays at the start while the second runs through.
The accumulate and loop solutions don't use list comprehensions, but I included them anyway out of curiosity / for reference.
The "idiom for assignment a temporary variable in comprehensions" that I'm using got optimized in Python 3.9.

Full benchmark code:

def cumulative_maximum_Kelly(iterable):
    return [m
            for it in [iter(iterable)]
            for m in it
            for xs in [[m], it]
            for x in xs
            for m in [max(m, x)]]

def cumulative_maximum_Kelly2(iterable):
    return [m
            for it in [iter(iterable)]
            for m in it
            for xs in [[m], it]
            for x in xs
            for m in [x if m < x else m]]

def cumulative_maximum_Kraigolas(iterable):
    return [m := x if idx == 0 else max(m, x) for idx, x in enumerate(iterable)]

def cumulative_maximum_Kraigolas2(iterable):
    return [m := x if not idx else x if m < x else m for idx, x in enumerate(iterable)]

def cumulative_maximum_enke(iterable):
    return [m
            for first, second in [tee(iterable)]
            for m in [next(first, [])]
            for x in second
            for m in [m if m > x else x]]

def cumulative_maximum_accumulate_max(iterable):
    return list(accumulate(iterable, max))

def cumulative_maximum_accumulate_lambda(iterable):
    return list(accumulate(iterable, lambda m, x: x if m < x else m))

def cumulative_maximum_loop(iterable):
    result = []
    it = iter(iterable)
    for m in it:
        result.append(m)
        for x in it:
            if m < x:
                m = x
            result.append(m)
    return result

def cumulative_maximum_loop2(iterable):
    result = []
    append = result.append
    it = iter(iterable)
    for m in it:
        append(m)
        for x in it:
            if m < x:
                m = x
            append(m)
    return result

funcs = [
    cumulative_maximum_Kelly,
    cumulative_maximum_Kelly2,
    cumulative_maximum_enke,
    cumulative_maximum_Kraigolas,
    cumulative_maximum_Kraigolas2,
    cumulative_maximum_accumulate_max,
    cumulative_maximum_accumulate_lambda,
    cumulative_maximum_loop,
    cumulative_maximum_loop2,
]

from timeit import repeat
from random import random
from itertools import tee, accumulate
import sys

tss = [[] for _ in funcs]
for _ in range(10):
    lst = [random() for _ in range(1000)]
    expect = funcs[0](iter(lst))
    for func, ts in zip(funcs, tss):
        assert func(iter(lst)) == expect
        t = min(repeat(lambda: func(iter(lst)), number=100)) / 100
        ts.append(t)
    for func, ts in sorted(zip(funcs, tss), key=lambda func_ts: sorted(func_ts[1])):
        print(*('%5.1f us ' % (t * 1e6) for t in sorted(ts)[:3]),
              func.__name__.removeprefix('cumulative_maximum_'))
    print()
print(sys.version)

Cumulative maximum of any iterable with a list comprehension

2 Answers2