15

Summary

Suppose I have an iterator that, as elements are consumed from it, performs some side effect, such as modifying a list. If I define a list l and call l.extend(iterator), is it guaranteed that extend will push elements onto l one-by-one, as elements from the iterator are consumed, as opposed to kept in a buffer and then pushed on all at once?

My experiments

I did a quick test in Python 3.7 on my computer, and list.extend seems to be lazy based on that test. (See code below.) Is this guaranteed by the spec, and if so, where in the spec is that mentioned?

(Also, feel free to criticize me and say "this is not Pythonic, you fool!"--though I would appreciate it if you also answer the question if you want to criticize me. Part of why I'm asking is for my own curiosity.)

Say I define an iterator that pushes onto a list as it runs:

l = []

def iterator(k):
  for i in range(5):
    print([j in k for j in range(5)])
    yield i

l.extend(iterator(l))

Here are examples of non-lazy (i.e. buffered) vs. lazy possible extend implementations:

def extend_nonlazy(l, iterator):
  l += list(iterator)

def extend_lazy(l, iterator):
  for i in iterator:
    l.append(i)

Results

Here's what happens when I run both known implementations of extend.


Non-lazy:

l = []
extend_nonlazy(l, iterator(l))
# output
[False, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]
[False, False, False, False, False]

# l = [0, 1, 2, 3, 4]

Lazy:

l = []
extend_lazy(l, iterator(l))
[False, False, False, False, False]
[True, False, False, False, False]
[True, True, False, False, False]
[True, True, True, False, False]
[True, True, True, True, False]

My own experimentation shows that native list.extend seems to work like the lazy version, but my question is: does the Python spec guarantee that?

Kye W Shi
  • 250
  • 3
  • 6
  • 1
    Related: the [docs](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types) describe `extend` as "for the most part the same as `s[len(s):len(s)] = t`", but the slice assignment is not lazy. – user2357112 Jul 25 '19 at 04:08
  • @user2357112 Cool--I just did a test, and `s[len(s):len(s)] = iterator(s)` behaves like the non-lazy version. – Kye W Shi Jul 25 '19 at 04:12
  • No, I do not believe this is a language guarantee – juanpa.arrivillaga Jul 25 '19 at 04:19
  • 3
    There is no Python specification. The closest thing, The Python Language Reference, I do not believe clarifies it. In absence of the formal description, CPython is the de-facto reference implementation, and it definitely inserts elements one by one as the iterable is consumed, unless the iterable is a tuple or a list (in which case it short-circuits the process for efficiency). ([Use the Source, Luke!](https://github.com/python/cpython/blob/196a530e00d88a138973bf9182e013937e293f97/Objects/listobject.c#L873)) – Amadan Jul 25 '19 at 04:26
  • Would `is l.extend(...) atomic wrt. l` be more accurate title? – Dima Tisnek Aug 22 '19 at 01:24

2 Answers2

7

I don't think the issue is lazy vs non lazy because, either in slice assignment or in list extend, you need all the elements of the iterator and these elements are consumed at once (in the normal case). The issue you raised is more important: are these operations atomic or not atomic? See one definition of "atomicity" in Wikipedia:

Atomicity guarantees that each transaction is treated as a single "unit", which either succeeds completely, or fails completely.

Have a look at this example (CPython 3.6.8):

>>> def new_iterator(): return (1/(i-2) for i in range(5))
>>> L = []
>>> L[:] = new_iterator()
Traceback (most recent call last):
...
ZeroDivisionError: division by zero
>>> L
[]

The slice assignment failed because of the exception (i == 2 => 1/(i - 2) raises an exception) and the list was left unchanged. Hence, the slice assignement operation is atomic.

Now, the same example with: extend:

>>> L.extend(new_iterator())
Traceback (most recent call last):
...
ZeroDivisionError: division by zero
>>> L
[-0.5, -1.0]

When the exception was raised, the two first elements were already appended to the list. The extend operation is not atomic, since a failure does not leave the list unchanged.

Should the extend operation be atomic or not? Frankly I have no idea about that, but as written in @wim's answer, the real issue is that it's not clearly stated in the documentation (and worse, the documentation asserts that extend is equivalent to the slice assignment, which is not true in the reference implementation).

jferard
  • 7,835
  • 2
  • 22
  • 35
  • 1
    I have said in another answer that `l.extend(iterable)` was the same as `for item in iterable: l.append(item)` except for where the loop is done. At least for the list type. This answer reveals a property that I wasn't considering when I said that. – Dan D. Jul 25 '19 at 16:42
  • 1
    Thanks for the clarification on "atomicity"--laziness was the best word that I could think of, so that's the best way I knew to describe it. – Kye W Shi Jul 25 '19 at 17:18
  • @KyeWShi your question was clear enough to be upvoted a lot! – jferard Jul 26 '19 at 04:27
1

Is Python list.extend(iterator) guaranteed to be lazy?

No. On the contrary, it's documented that

l.extend(iterable)

is equivalent to

l[len(l):] = iterable

In CPython, such a slice assignment will first convert a generator on the right hand side into a list anyway (see here), i.e. it's consuming the iterable all at once.

The example shown in your question is, strictly speaking, contradicting the documentation. I've filed a documentation bug, but it was promptly closed by Raymond Hettinger.

As an aside, there are less convoluted ways to demonstrate the discrepancy. Just define a failing generator:

def gen():
    yield 1
    yield 2
    yield 3
    uh-oh

Now L.extend(gen()) will modify L, but L[:] = gen() will not.

wim
  • 338,267
  • 99
  • 616
  • 750
  • Huh how come they didn't think of `l += iterable`? – U13-Forward Jul 25 '19 at 04:44
  • 1
    Because that is an assignment statement, and it [behaves completely differently than a list extend when used in a local scope](https://repl.it/@wimglenn/ForestgreenAssuredEngineers). – wim Jul 25 '19 at 04:54