
The Standard says:

A deque is a sequence container that supports random access iterators (27.2.7). In addition, it supports constant time insert and erase operations at the beginning or the end; insert and erase in the middle take linear time.

However, it also says in the same Clause:

All of the complexity requirements in this Clause are stated solely in terms of the number of operations on the contained objects. [ Example: The copy constructor of type vector<vector<int>> has linear complexity, even though the complexity of copying each contained vector<int> is itself linear. — end example ]

Doesn't this mean that insertion at the beginning of, say, deque<int> is allowed to take linear time as long as it doesn't perform more than a constant number of operations on the ints that are already in the deque and the new int object being inserted?

For example, suppose that we implement a deque using a "vector of size-K vectors". It seems that once per K insertions at the beginning, a new size-K vector must be added at the front, so all the other size-K vectors must be moved. This would mean the time complexity of insertion at the beginning is amortized O(N/K) where N is the total number of elements, but K is constant, so this is just O(N). But it seems that this is allowed by the Standard, because moving a size-K vector doesn't move any of its elements, and the "complexity requirements" are "stated solely in terms of the number of operations" on the contained int objects.
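To make that concrete, here is a minimal sketch of such a "vector of size-K vectors" (the name `BlockDeque` and the choice K = 16 are mine, purely for illustration):

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t K = 16;  // illustrative block size

struct BlockDeque {
    std::vector<std::vector<int>> blocks;  // each inner vector holds at most K ints

    void push_front(int value) {
        if (blocks.empty() || blocks.front().size() == K) {
            // A new front block shifts every existing block one slot to
            // the right: O(N/K) moves of std::vector objects, but moving
            // a vector only transfers its internal pointers -- no int is
            // constructed, copied, moved, or destroyed here.
            blocks.insert(blocks.begin(), std::vector<int>());
            blocks.front().reserve(K);
        }
        // At most K-1 ints shift inside the front block, plus the one
        // construction of `value`: a constant number of operations on
        // contained ints, since K is a constant.
        blocks.front().insert(blocks.front().begin(), value);
    }

    int& front() { return blocks.front().front(); }
};
```

Pushing N ints at the front this way performs only a constant number of operations on ints per insertion, yet every K-th insertion spends O(N/K) time shifting vector objects, which is exactly the loophole being asked about.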

Does the Standard really allow this? Or should we interpret it as having a stricter requirement, i.e. constant number of operations on the contained objects plus constant additional time?

Brian Bi
    Your interpretation sounds plausible. To add more fuel to the fire, [deque.modifiers] has *An insertion in the middle of the deque invalidates all the iterators and references to elements of the deque. An insertion at either end of the deque invalidates all the iterators to the deque, but has no effect on the validity of references to elements of the deque.* Which like your vector example works since moving invalidates the iterators but not the references. – NathanOliver Apr 07 '20 at 21:51
  • I think the example is a little confusing because it uses "linear" once for linear with respect to the number of elements of the `vector<vector<int>>`, but then for linear with respect to the elements of the inner `vector<int>`. If you only consider the number of elements of the outer vector, I'd consider copying the inner vector as constant, though I might be wrong; it's already late here – 463035818_is_not_an_ai Apr 07 '20 at 22:01

2 Answers


For example, suppose that we implement a deque using a "vector of size-K vectors".

That wouldn't be a valid implementation. Insertion at the front of a vector invalidates all of the pointers/references in the container. deque is required to not invalidate any pointers/references on front insertion.
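To see the difference concretely, here is a small test that a conforming `std::deque` is required to pass (references survive `push_front`), while the analogous check on front insertion into a vector would be undefined behavior:

```cpp
#include <cassert>
#include <deque>

int main() {
    std::deque<int> d = {1, 2, 3};
    int& ref = d[0];   // reference to the element holding 1
    d.push_front(0);   // invalidates iterators, but NOT references
    assert(ref == 1);  // ref must still be valid and unchanged
}
```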

But let's ignore that for now.

But it seems that this is allowed by the Standard, because moving a size-K vector doesn't move any of its elements, and the "complexity requirements" are "stated solely in terms of the number of operations" on the contained int objects.

Yes, that would be allowed. Indeed, real implementations of deque are not so dissimilar from that (though they don't use std::vector itself, for obvious reasons). The broad outline of a deque implementation is an array of pointers to blocks (with space for growth at both the front and back), with each block containing up to X items as well as pointers to the next/previous blocks (to make single-element iteration fast).
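As a rough sketch of that layout (all names and the block size here are invented for illustration; real implementations differ in detail):

```cpp
#include <cstddef>

constexpr std::size_t BLOCK_SIZE = 8;  // the "X" items per block

struct Block {
    int items[BLOCK_SIZE];  // storage for up to BLOCK_SIZE elements
    Block* prev;            // neighbor links keep ++it / --it cheap
    Block* next;
};

struct DequeRep {
    Block** map;           // array of pointers to blocks
    std::size_t map_size;  // capacity of the map
    std::size_t first;     // first used map slot (spare room to the left)
    std::size_t last;      // one past the last used slot (spare room to the right)
};
```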

If you insert enough elements at the front or back, then the array of block pointers has to grow. That requires an operation that is linear in the number of blocks (and hence in the number of items in the deque), but one that doesn't actually operate on the items themselves, so it doesn't count. The specification has nothing to say about the complexity of that operation.
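Continuing the illustrative `DequeRep` sketch above, that regrowth step might look roughly like this; note that it copies only `Block*` handles, never elements:

```cpp
#include <algorithm>
#include <cstddef>

// Growing the map copies Block* pointers only. No contained element is
// constructed, copied, moved, or destroyed, so the clause that counts
// only operations on contained objects treats this step as free.
void grow_map(DequeRep& d) {
    std::size_t new_size = d.map_size ? d.map_size * 2 : 4;
    Block** new_map = new Block*[new_size];
    std::size_t used = d.last - d.first;
    std::size_t new_first = (new_size - used) / 2;  // recenter: room at both ends
    std::copy(d.map + d.first, d.map + d.last, new_map + new_first);
    delete[] d.map;
    d.map = new_map;
    d.map_size = new_size;
    d.first = new_first;
    d.last = new_first + used;
}
```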

Without this provision, I'm not sure if the unique feature-set of deque (fast front/back insert and random-access) would be implementable.

Nicol Bolas
  • "An insertion at either end of the deque invalidates all the iterators to the deque, but has no effect on the validity of references to elements of the deque." – Brian Bi Apr 08 '20 at 15:10
  • Obviously an insertion at the front or back will *sometimes* take linear time. But if the array of pointers does indeed have space at the front, then the time should be amortized constant. If it doesn't have space at the front, then the time is not amortized constant. I am asking about whether the latter is allowed. – Brian Bi Apr 08 '20 at 15:11
  • @Brian: "*But if the array of pointers does indeed have space at the front, then the time should be amortized constant*" That's not what "[amortized constant](https://stackoverflow.com/q/200384/734069)" means. It means that the distance between the N's when it has to be non-constant is always increasing. If a `vector`'s implementation of automatic resizing only added a constant number of additional items, then insertion wouldn't be "amortized constant". So it's not about merely having space; you have to insert *more* space each time. – Nicol Bolas Apr 08 '20 at 15:21
  • @Brian: And yes, there's nothing in the `deque` requirements saying that either insert is "amortized constant" with respect to the number of items in the container; it is only constant with respect to the number of operations on items in the container. – Nicol Bolas Apr 08 '20 at 15:22
  • "Amortized constant" means that there is some constant c such that if you do N operations, then the total time taken is at most cN. Like I said, if the deque does not keep space at the front, then obviously insertions at the front cannot be amortized constant. If the deque does keep enough space at the front (more than a constant amount, as you say) then it is possible for it to be amortized constant. However, in both cases, it will be constant in terms of operations on elements. – Brian Bi Apr 08 '20 at 15:26
  • This argument about how much extra space is needed at the beginning is tangential. Again, the real question here is: if I need a structure that lets me push and pop ints at both the front and back efficiently, then can I rely on `std::deque` to be that structure? Or is there a possibility that pushing N ints at the front of a conforming `std::deque` will actually take N^2 CPU cycles? – Brian Bi Apr 08 '20 at 15:32
  • @Brian: "*if I need a structure that lets me push and pop ints at both the front and back efficiently, then can I rely on `std::deque` to be that structure?*" That all depends on how you define "efficiently". "*Or is there a possibility that pushing N ints at the front of a conforming std::deque will actually take N^2 operations?*" It's just as possible for pushing N `int`s at the back. Look, the standard says what it says. You can assume that all implementations are completely pathological with respect to what the standard doesn't say, or you can assume they know what they're doing. – Nicol Bolas Apr 08 '20 at 15:34
  • @Brian: To elaborate on that "efficiently" issue, `deque` isn't the most efficient data structure if *all* you want is O(1) insertion at the head/tail. `deque` also provides random-access iteration, which is why there's an array of pointers to blocks at all. If it only had to provide bidirectional iteration, then it could just be a linked-list of (possibly variably-sized) blocks. And if insertion/removal didn't have to preserve references, and insertion was allowed to be amortized constant time, then it could just be a circular buffer. – Nicol Bolas Apr 08 '20 at 15:46

I think you're reaching a bit in how you interpret the meaning of complexity domains. You're trying to draw a distinction between "linear time" and "linear complexity", and I'm not convinced that distinction makes much sense.

The standard is clear that insertion at the front is constant-time, and I think we all agree on that. The latter passage just tells us that what each of that constant number of operations involves underneath is simply not specified or constrained by the standard.

And this is not unusual. Any algorithm works at some level of abstraction. Even if we wrote an algorithm that came down to individual machine instructions, and we said that our algorithm only ever generated N machine instructions, we wouldn't go investigating what sort of complexity each individual instruction has inside the processor and add that into our results. We wouldn't say that some operations end up doing more at the quantum-molecular level and thus our O(N) algorithm is actually O(N×M³) or some such. We've chosen not to consider that level of abstraction. And, unless said complexity depends on the algorithm's inputs, that's completely fine.

In your case, the size of the moved/copied inner vectors isn't really relevant, because these do not inherently change as the deque grows. The number of inner vectors does, but the size of each one is an independent property. Thus, it is irrelevant when describing the complexity of inserting a new element into the outer vector.

Does the actual execution time (which could itself be described in some algorithmical terms if you so chose) vary depending on the contents of the copied inner vectors? Yes, of course. But that has nothing to do with how the task of expanding the outer vector grows in workload as the outer vector itself grows.

So, here, the standard is saying that it will not copy N or N² or even log N inner vectors when you put another one at the front; it is saying that the number of these operations will not change in scale as your deque grows. It is also saying that, for the purposes of that rule, it doesn't care what copying/moving the inner vectors actually involves or how big they are.

Complexity analysis is not about performance. Complexity analysis is about how performance scales.

Asteroids With Wings
  • Footnote: if you really want, you can describe the complexity of the operation as not _O(1)_, but _O(A+B+C+D+E+F+...)_, where those inner variables relate to the sizes of the affected existing elements, or some other function of them... but that would be (a) not useful, (b) beside the point of the rule being defined in that passage, and (c) a waste of time, since the standard has already pretty much informed us that there will be no impact on those existing elements.