What is the time complexity of iterating through a deque in Python?

Question

What is the time complexity of iterating, or more precisely each iteration through a deque from the collections library in Python?

An example is this:

elements = deque([1,2,3,4])
for element in elements:
  print(element)

Is each iteration a constant O(1) operation? or does it do a linear O(n) operation to get to the element in each iteration?

There are many resources online for time complexity with all of the other deque methods like appendleft, append, popleft, pop. There doesn't seem to be any time complexity information about the iteration of a deque.

Thanks!

deque = double ended queue. So finding the nth element is `O(n)` — Jean-François Fabre, Nov 16 '17 at 20:32
Yes. But you can get linear iteration if you do `for element in elements:` — juanpa.arrivillaga, Nov 16 '17 at 20:32
In other words, *iteration over a deque* is linear, but repeatedly indexing into a deque is quadratic. — juanpa.arrivillaga, Nov 16 '17 at 20:33
@juanpa.arrivillaga now I fully understand why `for i in range(len(elements))` is terribly unpythonic. oh and `len(elements)` is probably `O(n)` unless some caching is implemented. — Jean-François Fabre, Nov 16 '17 at 20:34
@Jean-FrançoisFabre So just to clarify, at each iteration, it is doing a O(n) operation to get to element[i]? Even though it is essentially a doubly linked list and each movement to the left or right could be O(1)? — perseverance, Nov 16 '17 at 20:35
@perseverance yes, if *you do the indexing* that's what has to happen. Fortunately, if you want to iterate over the deque, *you don't have to do that*. You almost **never** should be doing `for i in range(len(elements))`. It is merely ugly and error prone with `list` objects that have random-access, but with `deque` objects you'll feel the pain of an O(N) indexing operation. — juanpa.arrivillaga, Nov 16 '17 at 20:36
@juanpa.arrivillaga That is what I am looking for. So at each iteration it is in fact a O(1) operation, and as a whole to get to the end of the iteration, it is a O(n) operation. — perseverance, Nov 16 '17 at 20:36
each iteration (properly done) just takes the next element of the deque, so yes O(1). — Jean-François Fabre, Nov 16 '17 at 20:37
@perseverance: It'd be O(1) per iteration if you were doing it right, but you're doing it wrong. — user2357112, Nov 16 '17 at 20:37
@perseverance No. using `for i in range(len(my_deque)): print(my_deque[i])` will be quadratic. Using `for x in my_deque: print(x)` will be linear — juanpa.arrivillaga, Nov 16 '17 at 20:37
Iterate over the deque directly, not over the indices, as juanpa.arrivillaga already said. — user2357112, Nov 16 '17 at 20:38
Thanks guys! @juanpa.arrivillaga can you please add it as an answer? — perseverance, Nov 16 '17 at 20:39
As the comments have said, the right way to do it is always to iterate through the thing itself: `for elem in elements`. — Daniel Roseman, Nov 16 '17 at 20:39
@Jean-FrançoisFabre I don't believe this a duplicate question because its not about random access in a deque. Its about iterating over a deque. — perseverance, Nov 16 '17 at 20:41
I'm hesitating. It's about wrongly iterating over a deque. I've voted to close, and I won't reopen, but other commenters have the powers to do so, and I don't mind them doing so if they feel it's a different question (it's always nicer when the reopener has the blessing of the closer, not that it's strictly needed but...) — Jean-François Fabre, Nov 16 '17 at 20:42
also note (in the linked question) that deque is smarter than a simple double ended list. Is uses blocks, so it's better than O(n). — Jean-François Fabre, Nov 16 '17 at 20:43
@Jean-FrançoisFabre: It's still O(n). It's just got a better constant factor than if it were a textbook doubly-linked list instead of an unrolled list. — user2357112, Nov 16 '17 at 20:45
@user2357112 thanks for the clarification. My knowledge of complexity is also textbook :) — Jean-François Fabre, Nov 16 '17 at 20:45
I re-opened this, since I do think a question about the complexity of iteration isn't actually answered in the duplicate (and isn't really documented anywhere as far as I can find), so I resorted to empirical testing and checking the CPython source code. My C is very middling, so anyone feel free to correct my interpretation of the source code. — juanpa.arrivillaga, Nov 16 '17 at 22:32
Did nobody link to the documentation yet? https://docs.python.org/2/library/collections.html#collections.deque "Indexed access is O(1) at both ends but slows to O(n) in the middle" — en_Knight, Nov 16 '17 at 23:17
@juanpa.arrivillaga, a possible exception to "almost never": How to iterate over the first 10,000 elements of a 100,000-element deque? I see two bad options: Iterating over a range of indexes (slow) or iterating over the deque while maintaining a counter to stop (ugly code). Given the plot below, I would go with the first option. — Rainald62, Jun 25 '18 at 11:48

score 15 · Accepted Answer · edited Dec 03 '21 at 18:45

If your construction is something like:

elements = deque([1,2,3,4])
for i in range(len(elements)):
    print(elements[i])

You are not iterating over the deque, you are iterating over the range object, and then indexing into the deque. This makes the iteration polynomial time, since each indexing operation, elements[i] is O(n). However, actually iterating over the deque is linear time.

for x in elements:
    print(x)

Here's a quick, empirical test:

import timeit
import pandas as pd
from collections import deque

def build_deque(n):
    return deque(range(n))

def iter_index(d):
    for i in range(len(d)):
        d[i]

def iter_it(d):
    for x in d:
        x

r = range(100, 10001, 100)

index_runs = [timeit.timeit('iter_index(d)', 'from __main__ import build_deque, iter_index, iter_it; d = build_deque({})'.format(n), number=1000) for n in r]
it_runs = [timeit.timeit('iter_it(d)', 'from __main__ import build_deque, iter_index, iter_it; d = build_deque({})'.format(n), number=1000) for n in r]

df = pd.DataFrame({'index':index_runs, 'iter':it_runs}, index=r)
df.plot()

And the resulting plot:

Now, we can actually see how the iterator protocol is implemented for deque objects in CPython source code:

First, the deque object itself:

typedef struct BLOCK {
    struct BLOCK *leftlink;
    PyObject *data[BLOCKLEN];
    struct BLOCK *rightlink;
} block;

typedef struct {
    PyObject_VAR_HEAD
    block *leftblock;
    block *rightblock;
    Py_ssize_t leftindex;       /* 0 <= leftindex < BLOCKLEN */
    Py_ssize_t rightindex;      /* 0 <= rightindex < BLOCKLEN */
    size_t state;               /* incremented whenever the indices move */
    Py_ssize_t maxlen;
    PyObject *weakreflist;
} dequeobject;

So, as stated in the comments, a deque is a doubly-linked list of "block" nodes, where a block is essentially an array of python object pointers. Now for the iterator protocol:

typedef struct {
    PyObject_HEAD
    block *b;
    Py_ssize_t index;
    dequeobject *deque;
    size_t state;          /* state when the iterator is created */
    Py_ssize_t counter;    /* number of items remaining for iteration */
} dequeiterobject;

static PyTypeObject dequeiter_type;

static PyObject *
deque_iter(dequeobject *deque)
{
    dequeiterobject *it;

    it = PyObject_GC_New(dequeiterobject, &dequeiter_type);
    if (it == NULL)
        return NULL;
    it->b = deque->leftblock;
    it->index = deque->leftindex;
    Py_INCREF(deque);
    it->deque = deque;
    it->state = deque->state;
    it->counter = Py_SIZE(deque);
    PyObject_GC_Track(it);
    return (PyObject *)it;
}

// ...

static PyObject *
dequeiter_next(dequeiterobject *it)
{
    PyObject *item;

    if (it->deque->state != it->state) {
        it->counter = 0;
        PyErr_SetString(PyExc_RuntimeError,
                        "deque mutated during iteration");
        return NULL;
    }
    if (it->counter == 0)
        return NULL;
    assert (!(it->b == it->deque->rightblock &&
              it->index > it->deque->rightindex));

    item = it->b->data[it->index];
    it->index++;
    it->counter--;
    if (it->index == BLOCKLEN && it->counter > 0) {
        CHECK_NOT_END(it->b->rightlink);
        it->b = it->b->rightlink;
        it->index = 0;
    }
    Py_INCREF(item);
    return item;
}

As you can see, the iterator essentially keeps track of a block index, a pointer to a block, and a counter of total items in the deque. It stops iterating if the counter reaches zero, if not, it grabs the element at the current index, increments the index, decrements the counter, and tales care of checking whether to move to the next block or not. In other words, A deque could be represented as a list-of-lists in Python, e.g. d = [[1,2,3],[4,5,6]], and it iterates

for block in d:
    for x in block:
        ...

Great answer all around. I expect the graph to be monotonic though - why are their big oscillations in the blue curve? — en_Knight, Nov 16 '17 at 23:14
@en_Knight I could only speculate as to the source, but it's probably mostly noise. — juanpa.arrivillaga, Nov 16 '17 at 23:17
Probably :) Also, I really like citing the implementation, though it may be worth nothing that this behaviour is not implementation specific, it's also cited in the spec — en_Knight, Nov 16 '17 at 23:20
@en_Knight I wasn't able to find anything regarding iteration specifically, but if you drop a link I'll add it to the answer. — juanpa.arrivillaga, Nov 16 '17 at 23:27

What is the time complexity of iterating through a deque in Python?

1 Answers1

Linked

Related