Why is an STL deque not implemented as just a circular vector?

Question

I always thought that in C++ standard template library (STL), a double-ended queue (deque) is a size-variable array (like a vector) with circular boundary conditions, meaning there's a head pointer i and a tail pointer j both pointing to some position of an array a[0..L-1]. A push_front is i--, a push_back is j++, a pop_front is i++, and a pop_back is j--. When either pointer i or j reaches L or -1, it reappears on the other end of the array (0 and L-1 respectively). If the array size gets exhausted (pointers i==j after insering a new element), a larger space with double the original size is reallocated to a[] and data gets copied just like in a vector. There's also O(1) time random access taking into account the circular boundary condition. But someone tells me that inside an STL deque there's actually a pointer array pointing to many fixed-length array segments. It's much more complicated than a circular vector. What's the benefit of not using a simple circular vector to implement the functions of a deque? Will the random access get slower?

The benefit is that erasing items is fast. You don't have to reallocate the whole memory block and copy all the remaining items from the old to the new one. — Violet Giraffe, Sep 05 '16 at 05:04
@VioletGiraffe: You would never have to reallocate for an erase with the structure he is talking about either. Removing from the front or the back would be O(1) (a destruction and an increment or decrement of the head or tail index). Removing from the middle would be O(N), but considering that `deque` is short for double-ended queue, that doesn't seem like it would go against the purpose of the container. — Benjamin Lindley, Sep 05 '16 at 05:10
`deque` is not a circular container, it's like a linked list of vectors . There's a start and a finish — M.M, Sep 05 '16 at 05:13
FWIW, I have exactly the container you describe, and never once have I wanted to use `std::deque` instead. — Benjamin Lindley, Sep 05 '16 at 05:15
@BenjaminLindley: your container does not free its memory when items are removed, that may be undesirable. — Violet Giraffe, Sep 05 '16 at 05:59
@VioletGiraffe: No more undesirable than is the case with `std::vector`, and I see no shortage of people using that. Also, like `std::vector`, it allows you to manually free memory with a `shrink_to_fit`. — Benjamin Lindley, Sep 05 '16 at 06:01
@BenjaminLindley: doesn't `vector` shrink automatically at some point? — Violet Giraffe, Sep 05 '16 at 06:02
@VioletGiraffe: Not automatically. You have to request it. If it ever did, the guarantee made by `reserve` could not be met. — Benjamin Lindley, Sep 05 '16 at 06:03
@Galik seriously? Your comment seems to be wrong. "An insertion at either end of the deque invalidates all the iterators to the deque, but has no effect on the validity of references to elements of the deque. [26.3.8.4/1]" . — Rick, Dec 08 '19 at 14:50

6502 · Answer 1 · 2016-09-05T06:01:46.310

6

The main advantage of the std::deque approach is that elements once inserted in the container are never moved if you add or remove elements from either of the two ends. Thus references (and pointers) to elements are not invalidated when performing those operations (note that, quite surprisingly, iterators to deque elements are instead invalidated when doing insertions or deletions on the ends).

This, while making the implementation more complex, can be done without affecting the formal big-O complexity and makes std::deque a very useful container.

You can have an std::deque of "fat" objects without having to use an extra level of indirection to avoid moving operations and maintain efficiency.

edited Sep 05 '16 at 06:01

answered Sep 05 '16 at 05:35

6502

112,025
15
165
265

@Mehrdad: I personally don't use much iterators, but I'll add a note about it. – 6502 Sep 05 '16 at 05:38
Whoops, sorry, that was misleading. I meant pointers. Iterators are invalidated on insertion at either end, but pointers (and references) aren't. See [here](http://stackoverflow.com/a/914037/541686). – user541686 Sep 05 '16 at 05:41
I was surprised to learn that according to the standard insertion at either end of the `deque` invalidates all the iterators to the `deque`. – Leon Sep 05 '16 at 05:41
Your last edit introduced a wrong statement about iterator invalidation. – Leon Sep 05 '16 at 05:47
Fixed. Like I said I don't use much iterators with `vector`s or `deque`s... the iterator abstraction is not very usable in my opinion. – 6502 Sep 05 '16 at 05:59
Ah, yes. The data items are not moved. Only pointers to their chunks move in the pointer array (or vector). Once an iterator gets invalidated, is there a way to refresh it or do I have to define a new one? – Zhuoran He Sep 05 '16 at 16:01
@ZhuoranHe: I don't think there is any fast way to get the iterator from the pointer. If you need that probably you should keep indexes instead of iterators (direct index access is O(1)), remembering that when pushing/popping on `front()` all indexes change by +1/-1. – 6502 Sep 05 '16 at 16:58
Right. Never use an old iterator when the data structure is changed. This can be a useful reminder. – Zhuoran He Sep 05 '16 at 18:14

score 2 · Accepted Answer · answered Sep 05 '16 at 05:19

2

As cppreference writes

As opposed to std::vector, the elements of a deque are not stored contiguously: typical implementations use a sequence of individually allocated fixed-size arrays.

This means that the large internal reallocations std::vector occasionally does, are not performed by std::deque. When space runs out, only a small fixed-size array is added. (The same but reverse happens when space becomes too large because of erasing.)

Here is a small test:

#include <vector>
#include <deque>
#include <string>
#include <iostream>
#include <chrono>


using namespace std;


int main()
{
    {
        const auto start = chrono::high_resolution_clock::now();

        vector<string> v;
        for(size_t i = 0; i < 9999999; ++i)
            v.push_back(string("hello"));

        cout << chrono::duration_cast<chrono::milliseconds>(chrono::high_resolution_clock::now() - start).count() << endl;
    }

    {
        const auto start = chrono::high_resolution_clock::now();

        deque<string> v;
        for(size_t i = 0; i < 9999999; ++i)
            v.push_back(string("hello"));

        cout << chrono::duration_cast<chrono::milliseconds>(chrono::high_resolution_clock::now() - start).count() << endl;
    }

    return 0;
}

On my machine, it shows deque is twice as fast as vector for this case:

$ ./a.out 
301
164

answered Sep 05 '16 at 05:19

Ami Tavory

74,578
11
141
185

The factor of 2 makes sense. When `push_back` an element to a vector of size `2^n`, the vector needs to double its capacity to `2^(n+1)`, moving all `2^n` elements to the new space and adding the one new element. The total number of assignments all the way from size `0` to size `2^n` is `1+(2^0+1)+(2^1+1+1)+(2^2+1+...+1)+...+(2^(n-1)+1+...+1) = 1+2+2^2+...+2^n = 2^(n+1)-1`, which is about twice of `2^n`. Deque nearly avoids this factor of 2 by moving not the individual data items but the internal pointers of chunks of data in the pointer array. So deque is designed to handle more items. – Zhuoran He Sep 05 '16 at 15:49
Do you know how big is each chunk of data? Is there a parameter that can adjust this size? – Zhuoran He Sep 05 '16 at 15:51
@ZhuoranHe Unfortunately, it's implementation-dependent and opaque - the interface doesn't allow querying it. – Ami Tavory Sep 05 '16 at 16:02
[Stackoverflow link 1](http://stackoverflow.com/questions/4097729/stddeque-memory-usage-visual-c-and-comparison-to-others) says the chunk size is 16 bytes. Every empty string takes 8 bytes in my g++. If it's just 2 strings per chunk, you wouldn't get much performance difference of a deque from a vector. [Stackoverflow link 2](http://stackoverflow.com/questions/5728359/stl-internals-deque-implementation) says 512 bytes and looks more reasonable. – Zhuoran He Sep 05 '16 at 17:54
1

But the frequency of allocations for a linked list of fixed-size arrays is much greater than for a circular vector. Why is it faster to allocate small chunks many times than large chunks fewer times? If anything, I would have thought the other way, since much of the memory allocation cost is per-allocation (regardless of size of the memory being allocated). On top of that, once a queue size stabilizes, a circular vector would not need allocations at all, while the current implementation would always need to allocate/free memory blocks. – max Jul 16 '17 at 23:52
1

@max Those are excellent points, but it's a tradeoff. On the one hand, there are more frequent allocations, on the other hand there's less copying. The only thing to do is to test the alternatives for the specific implementation you have and use case, and see what's faster. As for your last sentence, it indeed sounds like it might make sense, once a queue is stabilized, to copy it into a vector (but, again, per-case test would be necessary). – Ami Tavory Jul 17 '17 at 05:09

score 1 · Answer 3 · answered Sep 05 '16 at 05:37

1

23.3.8.4 [deque.modifiers] (emphasis is mine)

An insertion in the middle of the deque invalidates all the iterators and references to elements of the deque. An insertion at either end of the deque invalidates all the iterators to the deque, but has no effect on the validity of references to elements of the deque.

This is not possible with a circular-vector-like implementation.

answered Sep 05 '16 at 05:37

Leon

31,443
4
72
97

IIUC, with a circular array, you would *not* invalidate iterators when you insert at either end of the `deque` (since iterators can be implemented as simple indexes due to index lookup being `O(1)` with circular arrays). Iterators being effectively a more powerful references, it would seem to actually be a net improvement to go from invalidating iterators but not references to invalidating references but not iterators. – max Jul 16 '17 at 23:27
@max If iterator is implemented as a simple index (+ a reference to the container) then inserting at front would still invalidate, in a milder sense, all the iterators because of the shift in the indexing (previous element #0 now becomes element #1, and so on). – Leon Jul 17 '17 at 06:54

score 0 · Answer 4 · answered Sep 05 '16 at 05:12

std::deque (double-ended queue) is an indexed sequence container that allows fast insertion and deletion at both its beginning and its end. In addition, insertion and deletion at either end of a deque never invalidates pointers or references to the rest of the elements.

As opposed to std::vector, the elements of a deque are not stored contiguously: typical implementations use a sequence of individually allocated fixed-size arrays.

The storage of a deque is automatically expanded and contracted as needed. Expansion of a deque is cheaper than the expansion of a std::vector because it does not involve copying of the existing elements to a new memory location.

The complexity (efficiency) of common operations on deques is as follows:

Random access - constant O(1)
Insertion or removal of elements at the end or beginning - constant O(1)
Insertion or removal of elements - linear O(n)

Source : std::deque

Why is an STL deque not implemented as just a circular vector?

4 Answers4

Linked