15

I occasionally use the "trick" to extend a list by a mapped version of itself, for example to efficiently compute powers of 2:

from operator import mul

powers = [1]
powers += map(mul, [2] * 10, powers)

print(powers)   # prints [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]

This relies on the += immediately appending each value from map to the list, so that the map then finds it and the procedure continues. In other words, it needs to work like this:

powers = [1]
for value in map(mul, [2] * 10, powers):
    powers.append(value)

And not first compute and store the whole right-hand side like this, where powers ends up being [1, 2]:

powers = [1]
powers += list(map(mul, [2] * 10, powers))

Is it somewhere guaranteed that it works like it does? I checked the Mutable Sequence Types documentation but it doesn't say much about it other than implying equivalence of s += t and s.extend(t). It does refer to MutableSequence, whose source code includes this:

    def extend(self, values):
        'S.extend(iterable) -- extend sequence by appending elements from the iterable'
        if values is self:
            values = list(values)
        for v in values:
            self.append(v)
    def __iadd__(self, values):
        self.extend(values)
        return self

This does suggest that it's indeed supposed to work like it does and like I want it, but some source code being what it is doesn't feel as safe as a guarantee in the documentation.

Karl Knechtel
  • 62,466
  • 11
  • 102
  • 153
Kelly Bundy
  • 23,480
  • 7
  • 29
  • 65
  • 5
    This is not a behaviour of the list object; it is a behaviour of the `+=` operator. Yes, `+=` on a list, as a special case, uses the `.extend` method - modifying the original object, and therefore enabling the `map`'s internal iteration process to find more elements. Yes, the `+=` behaviour is documented and guaranteed. No, please do not write code like this. Elements [should not be added or removed from a sequence while iterating over it](/questions/742371/why-does-python-skip-elements-when-i-modify-a-list-while-iterating-over-it). – Karl Knechtel May 07 '22 at 18:47
  • 1
    "it doesn't say much about it other than implying equivalence of `s += t` and `s.extend(t)`" The observed result follows directly from that equivalence, and I think it's pretty overtly stated. – Karl Knechtel May 07 '22 at 18:52
  • wow, that is a weird trick! and I always assumed `extend()` maybe did something, I don't know, more efficient than just appending the items one-by-one. so I learned two things today! – Anentropic May 07 '22 at 18:59
  • The answer I linked details the "`extend` is implemented in terms of `__iadd__`" behaviour. – Karl Knechtel May 07 '22 at 19:01
  • @Anentropic Well, what could be more efficient? If it did for example append the items to a temporary extra list, then you'd have the same costs for appending there. And then the extra cost of afterwards moving them to the target list. – Kelly Bundy May 07 '22 at 19:01
  • @Anentropic the code shown there is not a part of the implementation of the built-in `list` (which probably does that work at the C level, but it does necessarily involve iteration). It is a *mixin* method - fallback code for user-defined classes that want to support the full `MutableSequence` interface. Please see the [collections.abc documentation](https://docs.python.org/3/library/collections.abc.html) for more information, and [PEP 3119](https://peps.python.org/pep-3119/) for design rationale. – Karl Knechtel May 07 '22 at 19:03
  • "And that tells me what?" Sorry; if it doesn't answer the question then I don't understand how there is a question here. I reiterate my first two comments. I'll reopen, though. – Karl Knechtel May 07 '22 at 19:04
  • If you're looking for an official statement of "yes, the code is supposed to do what you observe, and will be guaranteed to do so in future versions barring deprecation notices" or something like that, my suggestion is to try the Python mailing list. – Karl Knechtel May 07 '22 at 19:07
  • 1
    @KarlKnechtel I'm looking for something in the documentation specifying the behavior of `+=`/`extend`. Ideally something like how the order of `dict.keys` and `dict.values` were guaranteed to correspond, although it doesn't need to be as in-your-face as that. Your first comment shows you understand the issue, but I don't think your comments address the question. – Kelly Bundy May 07 '22 at 19:13
  • "something in the documentation specifying the behavior of `+=`/`extend`. " Is `extends s with the contents of t (for the most part the same as s[len(s):len(s)] = t)` insufficient? – Karl Knechtel May 07 '22 at 19:17
  • @KarlKnechtel Yes, that's insufficient. It doesn't rule out that it behaves like `powers += list(map(mul, [2] * 10, powers))`, does it? – Kelly Bundy May 07 '22 at 19:18
  • Oh. If *that's* the point of contention, it's a consequence of `map` being lazy. The approach taken by the `MutableSequence` mixin code will have the same effect. – Karl Knechtel May 07 '22 at 19:21
  • 4
    @KarlKnechtel I know *map* is lazy, but that doesn't prevent the code that implements the `+=` from fully consuming it before appending the values to the list. – Kelly Bundy May 07 '22 at 19:23
  • Hmm. You have a point there. – Karl Knechtel May 07 '22 at 19:26
  • Maybe not relevant to OP, but why not use list comprehension instead which much more clearly expresses what you are trying to do? For example: [2**i for i in range(11)] – saltthehash May 08 '22 at 06:31
  • @saltthehash Like I said: *"to **efficiently** compute powers of 2"*. Mine is over 3 times faster than your list comp for going up to 2^10, and about [16 times faster](https://tio.run/##TY/RasMwDEXf/RV6KbXTEOaEwRjkS0oJ2aZshlo2ikLp12d242zVky2de3UV7/ITqHuLvK4TBw9OkCWE6wzOx8ACjBFHUY9hiMijBN5nfrmq8hTn0YlSXzhBDDfk2WrzriDV9oUezvby3Dj14Meok0ldtui2BjJ1AcwDZpSFqbSe/dvdvxDntqocTDkdOAIe6Rs1wQmsuShFab99SaUyMfwTXXHJ7WmhzzwpB@xB2o3IJcnGO9LbuU2JnXUp@eI/kHtrzB8e2ZHo4@G1sRP4GY5wAC1QgcUu3Zl1zTDQ6HEYNtWmMOv6Cw) for going up to 2^10000. – Kelly Bundy May 08 '22 at 12:10
  • @saltthehash (continuing) Also, in other cases you just might not have a nice "closed form" expression for each individual element, but need to compute each element from the previous one. This doesn't happen often, and I don't do this often, just like I said "occasionally". In those cases where it *is* advantageous. – Kelly Bundy May 08 '22 at 12:15

1 Answers1

7

I don't see any tests or docs that the greedy behavior is guaranteed; however, I do think it is the expected behavior and that code in the wild relies on it.

FWIW, += with lists is equivalent to list.extend(), so your "trick" boils down to:

>>> powers = [1]
>>> powers.extend(2*x for x in islice(powers, 10))
>>> powers
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]

While I haven't found a guarantee for += or extend, we do have a guarantee that the list iterator allows mutation while iterating.¹ So, this code is on firm ground:

>>> powers = [1]
>>> for x in powers:
        if len(powers) == 10:
            break
        powers.append(2 * x)

>>> powers
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512]

¹ See the second paragraph following the table at: https://docs.python.org/3/library/stdtypes.html#common-sequence-operations:

Forward and reversed iterators over mutable sequences access values using an index. That index will continue to march forward (or backward) even if the underlying sequence is mutated. The iterator terminates only when an IndexError or a StopIteration is encountered (or when the index drops below zero).

Raymond Hettinger
  • 216,523
  • 63
  • 388
  • 485
  • 2
    One of the few people from whom *"I don't know of a guarantee"* is a useful statement :-). Where can I find that other guarantee that you mention? And what do you think: Given that it's expected behavior and that probably code in the wild relies on it, is it safe to assume it can be relied on? And what do you think about Karl's ["should not"](https://stackoverflow.com/questions/72155476/is-this-greedy-behavior-of-lists-guaranteed/72155954#comment127489363_72155476)? When I do use this, it's for short/fast code, and I do like using it and disagree that I shouldn't. – Kelly Bundy May 07 '22 at 20:07
  • Same last question about your "on firm ground" code: Is that something that one "should not" do? (I don't think so, I think it's ok.) – Kelly Bundy May 07 '22 at 20:28
  • 2
    Mutating lists while iterating is something that is done. Should or should not Is a personal decision. The usual rule is to stick with simple and obvious code unless there is a some practical reason to need less conventional or more tricky code. If the need for a hack arises, it at least warrants a comment to explain what it does and why it is preferred. – Raymond Hettinger May 07 '22 at 22:40
  • 1
    Yes, "personal decision" is my stance as well. I'm fine with people not wanting to do it, but I disagree that that means *I* shouldn't, either. I suspect the *"don't modify a list while iterating it"* mantra is just the result of having seen too many *wrong* ways of doing so, like in the question Karl referenced. And now that mantra gets echoed over and over. Anyway, I now took your last bit of advice and added my rationale for using this to my latest [answer](https://stackoverflow.com/a/72154275/12671057) (which inspired me to finally ask about this). – Kelly Bundy May 07 '22 at 23:05