The C++ Standard (N4901) says this with reference to the deque push_front function:
An implementation shall provide these operations for all container types shown in the “container” column, and shall implement them so as to take amortized constant time.
If I am not mistaken, this means that a standards-compliant implementation should provide a deque push_front function that takes amortized O(1) time.
However, after looking at the source code and after running some test programs under gdb, I have come to believe that the libc++ implementation of the deque push_front function in fact has O(log n) amortized time.
For context: The STL deque is implemented in libc++ using a two-level array, the map
is a dynamic array that contains pointers to fixed-sized arrays called blocks
. There is only one map
, which is unbounded in size, and an unbounded number of blocks
, which are fixed in size.
Here is an accurate visualization (for proof, see below) of how the map
looks after each new block
(the blocks
are named in order of allocation, so b01 was the first block
that was allocated, b02 is the second block
that was allocated, and so on) is added to the map
by calls to push_front
. Linear time operations such as copying or shifting the entire map
array via memmove
are labeled:
[]
[b01]
[b02, b01] copied
[b03, b02, b01, ___] copied
[b04, b03, b02, b01] shifted by 1
[b05, b04, b03, b02, b01, ___, ___, ___] copied
[___, b06, b05, b04, b03, b02, b01, ___] shifted by 2
[b07, b06, b05, b04, b03, b02, b01, ___]
[b08, b07, b06, b05, b04, b03, b02, b01] shifted by 1
[b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___] copied
[___, ___, ___, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___] shifted by 4
[___, ___, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___]
[___, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___]
[b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___]
[___, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___] shifted by 2
[b15, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___]
[b16, b15, b14, b13, b12, b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01] shifted by 1
Proof: I used this program to look at when blocks are added to the map
in libc++ and to examine the contents of the map
in memory:
#include <deque>
long mylong = 13370000;
void my_pf(std::deque<long>& d){
d.push_front(mylong);
++mylong;
}
void pushn(std::deque<long>& d, int n){
for (int i = 0; i < n; i++){
my_pf(d);
}
}
int main()
{
std::deque<long> d = {};
// zero blocks
pushn(d, 256);
// one block
my_pf(d);
// two blocks
pushn(d, 511);
// still two blocks
my_pf(d);
// three blocks
pushn(d, 511);
// still three blocks
my_pf(d); // 4th block is added
pushn(d, 511);
my_pf(d); // 5th block is added
pushn(d, 511);
my_pf(d); // 6th block is added
pushn(d, 511);
my_pf(d); // 7th block is added
pushn(d, 511);
my_pf(d); // 8th block is added
pushn(d, 511);
my_pf(d); // 9th block is added
pushn(d, 511);
my_pf(d); // 10th block is added
pushn(d, 511);
my_pf(d); // 11th block is added
pushn(d, 511);
my_pf(d); // 12th block is added
pushn(d, 511);
my_pf(d); // 13th block is added
pushn(d, 511);
my_pf(d); // 14th block is added
pushn(d, 511);
my_pf(d); // 15th block is added
pushn(d, 511);
my_pf(d); // 16th block is added
pushn(d, 511);
my_pf(d); // 17th block is added
}
Here I dump the contents of the map in gdb after every new block is added, starting at the first block:
(gdb) print __map_.capacity()
$3 = 1
(gdb) print *__map_.__first_@1
$4 = {0x4092c0}
(gdb) print __map_.capacity()
$1 = 2
(gdb) print *__map_.__first_@2
$2 = {0x40a2f0, 0x4092c0}
(gdb) print __map_.capacity()
$3 = 4
(gdb) print *__map_.__first_@4
$4 = {0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_.capacity()
$5 = 4
(gdb) print __map_
$1 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40b300, __begin_ = 0x40b300,
__end_ = 0x40b320, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40b320}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@4
$6 = {0x40c340, 0x40b330, 0x40a2f0, 0x4092c0}
(gdb) print __map_
$2 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d378, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$8 = {0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$11 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d358,
__end_ = 0x40d388, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$10 = {0x40d3a0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_
$3 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d388, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$4 = {0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_
$5 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x40d350, __begin_ = 0x40d350,
__end_ = 0x40d390, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x40d390}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@8
$6 = {0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0}
(gdb) print __map_
$7 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,
__end_ = 0x411428, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print __map_.capacity()
$8 = 16
(gdb) print *__map_.__first_@16
$9 = {0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0, 0x0, 0x0,
0x0, 0x0}
(gdb) print __map_
$10 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113f8,
__end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$11 = {0x411470, 0x4103d0, 0x40f3c0, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,
0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$12 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113f0,
__end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$13 = {0x411470, 0x4103d0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,
0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$14 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e8,
__end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$15 = {0x411470, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,
0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$16 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,
__end_ = 0x411448, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$17 = {0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0, 0x40c340, 0x40b330,
0x40a2f0, 0x4092c0, 0x0, 0x0, 0x0}
(gdb) print __map_
$18 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e8,
__end_ = 0x411458, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$19 = {0x4154b0, 0x4164c0, 0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0,
0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_
$20 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,
__end_ = 0x411458, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$21 = {0x4174d0, 0x4164c0, 0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0, 0x40d3a0,
0x40c340, 0x40b330, 0x40a2f0, 0x4092c0, 0x0}
(gdb) print __map_
$22 = {<std::__1::__split_buffer_common<true>> = {<No data fields>}, __first_ = 0x4113e0, __begin_ = 0x4113e0,
__end_ = 0x411460, __end_cap_ = {<std::__1::__compressed_pair_elem<long**, 0, false>> = {
__value_ = 0x411460}, <std::__1::__compressed_pair_elem<std::__1::allocator<long*>, 1, true>> = {<std::__1::allocator<
long*>> = {<No data fields>}, <No data fields>}, <No data fields>}}
(gdb) print *__map_.__first_@16
$23 = {0x4184e0, 0x4174d0, 0x4164c0, 0x4154b0, 0x4144a0, 0x413490, 0x412480, 0x411470, 0x4103d0, 0x40f3c0, 0x40e3b0,
0x40d3a0, 0x40c340, 0x40b330, 0x40a2f0, 0x4092c0}
For comparison, here is the same visualization (for proof, see Appendix C) for the libstdc++ deque implementation. Linear time operations such as copying or shifting the entire map
array are labeled:
[___, ___, ___, b01, ___, ___, ___, ___]
[___, ___, b02, b01, ___, ___, ___, ___]
[___, b03, b02, b01, ___, ___, ___, ___]
[b04, b03, b02, b01, ___, ___, ___, ___]
[___, ___, ___, ___, ___, ___, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___] copied
[___, ___, ___, ___, ___, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, ___, ___, ___, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, ___, ___, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, ___, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[b11, b10, b09, b08, b07, b06, b05, b04, b03, b02, b01, ___, ___, ___, ___, ___, ___, ___]
[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b12, b11, ..., b01, ___x13] copied
[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, ___, ___, ___, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, ___, ___, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, ___, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, ___, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, ___, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, ___, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, ___, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, ___, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, ___, b23, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[___, b24, b23, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
[b25, b24, b23, b22, b21, b20, b19, b18, b17, b16, b15, b14, b13, b12, b11, ..., b01, ___x13]
Time complexity analysis
For the libc++ deque:
Each copy and each shift is a linear time operation, and since the number of shifts clearly dominates the number of copies, we will only look at the cost of the shifts.
To derive a lower bound we will only look at the shifts that happen after the last doubling of the map
array.
Shifts are done using the memmove
instruction which is a linear time operation. The cost of a shift (asymptotically) is directly proportional to the number of elements that are shifted.
Just after the map
is resized, its capacity is n and the number of elements inside it is n/2 + 1
. When another block
is added to it, the elements inside it are shifted by half of the remaining distance from the last element to the end of the map
array, that is to say, it is shifted by half of n/2 - 1
, rounded up. This leaves exactly n/4 - 1
free slots at both ends of the map
. When the free space at the front is filled up, the elements are shifted again, by the same formula as before.
That is to say, each time the shift happens, the amount of free space between the last element and the end of the array is reduced by half (rounded up). The total number of shifts, therefore, is on the order of log2(n/2)
which is equal to log2(n) - 1
.
So now that we have the number of shifts, we just need to calculate the cost of each shift.
The first shift moves n/2 + 1
elements, the second shift moves n/2 + n/4 + 1
, the third shift moves n/2 + n/4 + n/8 + 1
elements and so on. To calculate a lower bound, we'll simply assume that each shift moves exactly n/2
elements.
So we have: the number of shifts is log2(n) - 1
and each shift has a cost of n/2
.
The total cost is:
(log2(n) - 1) * n/2
= 0.5 * n * (log2(n) - 1)
= 0.5 * n * log2(n) - 0.5 * n
To compute the amortized time complexity of the push_front
operation, we divide the total cost by the number of times we called push_front
, which is n * block_size
times (we called it n * block_size
times to insert n
blocks). Thus we have that the amortized cost is:
0.5 * n * (log2(n) - 1) / (n * block_size)
= 0.5 * (log2(n) - 1) / block_size
= (0.5 * log2(n) - 0.5) / block_size
Now, to get the Big Omicron, we get rid of the constants and the constant factors, and we replace log2 with log (since log base doesn't matter in Big O analysis), in order to arrive at the final result:
O(log n)
And that is the amortized time complexity of the push_front
operation in the libc++ deque implementation. Do note that this is a lower bound.
I have posted a more detailed explanation of how the libc++ implementation of the STL deque push_front here, if more context is required: https://1f604.blogspot.com/2021/11/the-libc-implementation-of-stl-deque.html
Is my analysis correct? If the amortized time complexity of the libc++ implementation of the STL deque is indeed O(log n), does that mean that it's not standards-compliant?