How to sum variable-size parts of a collection?

Question

I want to calculate the sum of a collection, for sections of different sizes:

d = (1, 2, 3, 4, 5, 6, 7, 8, 9)
sz = (2, 3, 4)

# here I expect 1+2=3, 3+4+5=12, 6+7+8+9=30

itd = iter(d)
result = tuple( sum(tuple(next(itd) for i in range(s))) for s in sz )

print("result = {}".format(result))

I wonder whether the solution I came up with is the most 'pythonic' (elegant, readable, concise) way to achieve what I want...

In particular, I wonder whether there is a way to get rid of the separate iterator 'itd', and whether it would be easier to work with slices?

Maybe I am misled, but I like to have expressions that are not meant to change to be immutables... Like I would use a `const` expression in `C`, both for readability and to support optimization by the compiler. I have read the thread about homogeneous vs. heterogeneous contents but am not convinced. That is a bit like mixing up array and list in other languages with immutability (const vs. non-const). But the inner tuple should for sure be omitted. — user52366, Feb 21 '18 at 13:32
Would a [definitive pronouncement by the BDFL](https://mail.python.org/pipermail/python-dev/2003-March/033964.html) help to convince you? Or [another, even more definitive one](https://mail.python.org/pipermail/python-dev/2003-March/033972.html) from the same discussion? — Zero Piraeus, Feb 21 '18 at 13:48
Thanks for the link. I understand part of the reasoning but it still seems weird to me to rank semantics above some hard-coded (and I think useful) feature like immutability. — user52366, Feb 22 '18 at 07:52

score 2 · Answer 1 · answered Dec 23 '17 at 21:04

There's no reason to get rid of your iterator – iterating over d is what you are doing, after all.

You do seem to have an overabundance of tuples in that code, though. The line that's doing all the work could be made more legible by getting rid of them:

it = iter(d)
result = [sum(next(it) for _ in range(s)) for s in sz]
# [3, 12, 30]

… which has the added advantage that now you're producing a list rather than a tuple. d and sz also make more sense as lists, by the way: they're variable-length sequences of homogeneous data, not fixed-length sequences of heterogeneous data.

Note also that `it` is the conventional name for an arbitrary iterator, and `_` is the conventional name for any variable that must exist but is never actually used.

Going a little further, `next(it) for _ in range(s)` is doing the same work that `islice()` could do more legibly:

from itertools import islice

it = iter(d)
result = [sum(islice(it, s)) for s in sz]
# [3, 12, 30]

… at which point, I'd say the code's about as elegant, readable and concise as it's likely to get.
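If this is needed in more than one place, the `islice` version wraps up naturally as a small function. The following is only a sketch: the name `chunk_sums` and its parameter names are made up for illustration and do not come from the question.

```python
from itertools import islice


def chunk_sums(data, sizes):
    """Sum consecutive runs of `data`, one run per entry in `sizes`."""
    # One shared iterator, so each run resumes where the previous one ended.
    it = iter(data)
    return [sum(islice(it, s)) for s in sizes]


print(chunk_sums((1, 2, 3, 4, 5, 6, 7, 8, 9), (2, 3, 4)))  # [3, 12, 30]
print(chunk_sums([1, 2, 3], [2, 3]))  # [3, 3]: the second run only finds one item
```

One difference worth knowing: `islice` stops quietly when the iterator runs out, so sizes that ask for more items than exist give smaller sums rather than an error. The `next(it) for _ in range(s)` version raises `RuntimeError` in that situation on Python 3.7 and later.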

I ended up writing your `islice` without reading yours first ;-) +1 — dawg, Dec 23 '17 at 23:17
the islice part improved readability I think. I used a tuple because I considered the result immutable. — user52366, Feb 21 '18 at 13:27
The most important distinction between tuples and lists is not mutable vs immutable, but order vs structure: see [Lists vs. Tuples](https://nedbatchelder.com/blog/201608/lists_vs_tuples.html) by Ned Batchelder for more on this. A list is the semantically correct data structure to use here. — Zero Piraeus, Feb 21 '18 at 13:40

score 2 · Accepted Answer · answered Dec 23 '17 at 23:15

I would use `itertools.islice`, since you can use each value in `sz` directly as the number of items to take at that point:

>>> from itertools import islice
>>> it=iter(d)
>>> [sum(islice(it,s)) for s in sz]
[3, 12, 30]

Then you can convert that to a tuple if needed.

The `iter` call is needed so that each slice picks up where the previous one left off. Otherwise each slice would start over from the beginning of the tuple, i.e. it would be `d[0:s]`.
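To make that concrete, here is a small sketch that first shows what happens without the shared iterator, and then one possible way to do the same job with plain slices, which the question also asks about. The slice version uses `itertools.accumulate` to turn the sizes into offsets; the variable names `restarted`, `ends`, `starts` and `by_slices` are only illustrative.

```python
from itertools import accumulate, islice

d = (1, 2, 3, 4, 5, 6, 7, 8, 9)
sz = (2, 3, 4)

# Without a shared iterator, every islice restarts at the beginning of d.
restarted = [sum(islice(d, s)) for s in sz]
print(restarted)  # [3, 6, 10]

# Slice-based alternative: the running totals of sz are the end offsets.
ends = list(accumulate(sz))   # [2, 5, 9]
starts = [0] + ends[:-1]      # [0, 2, 5]
by_slices = [sum(d[a:b]) for a, b in zip(starts, ends)]
print(by_slices)  # [3, 12, 30]
```

The slice version avoids the separate iterator, but it only works on sequences that support slicing and it needs the extra offset bookkeeping, so the `islice` form is arguably still the shorter of the two.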
