ith order statistic using C++'s STL

Question

Given an empty array, I need to support two types of queries:

1. Inserting an element into the array
2. Finding the index of some element k (obviously the array has to be kept sorted)

This can be done by using the set container:

set<int> st;
st.insert(t);

This will insert my element in O(log(n)).

And for the second query:

set<int>::iterator it = st.find(k);
int idx = distance(st.begin(), it);

This takes O(n) time: O(n) for distance() plus O(log(n)) for set::find().

Is there any way to do both queries in O(log(n)) using the predefined containers of C++?

http://www.cplusplus.com/reference/stl/

http://kera.name/articles/2010/08/it-is-not-called-the-stl-mmkay/ — Griwes, Feb 04 '13 at 18:58
**It can be done with GNU extension in the `libstdc++`, see [here](http://stackoverflow.com/a/23095152/341970).** — Ali, Apr 16 '14 at 13:34
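
For reference, a minimal sketch of the GNU extension mentioned in the comment above, assuming GCC with libstdc++ (the `<ext/pb_ds/...>` headers are not part of standard C++). The alias `ordered_set` and the helpers `index_of` and `kth` are names made up for this illustration; `order_of_key` and `find_by_order` are the extension's own member functions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

// Red-black tree whose nodes also track subtree sizes (GNU extension, GCC only).
using ordered_set = __gnu_pbds::tree<
    int, __gnu_pbds::null_type, std::less<int>,
    __gnu_pbds::rb_tree_tag,
    __gnu_pbds::tree_order_statistics_node_update>;

// Hypothetical helper: index of k in sorted order, or -1 if k is not stored.
long index_of(const ordered_set& s, int k) {
    if (s.find(k) == s.end()) return -1;
    // order_of_key returns the number of keys strictly less than k.
    return static_cast<long>(s.order_of_key(k));
}

// Hypothetical helper: the i-th smallest element (0-based).
int kth(const ordered_set& s, std::size_t i) {
    assert(i < s.size());
    return *s.find_by_order(i);
}
```

Insertion is the usual `s.insert(t)`, and both helpers walk a single root-to-leaf path, so both queries from the question should run in O(log(n)).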

score 5 · Answer 1 · answered Feb 04 '13 at 18:13

I don't think this is possible with the containers of the standard library, since supporting access by index would require changing the implementation (adding a counter to each node), which would increase the size of each node. And C++'s philosophy is "don't pay for what you don't use".

If you really need this, there's a countertree implementation suggested for boost (and it supports at least some of the C++11 features) which fulfills your requirements.
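
To make the "counter in each node" idea concrete, here is a rough sketch. It is not how std::set or the countertree library is implemented: the tree below is a plain unbalanced BST, so the O(log(n)) bound only holds if the tree happens to stay balanced (a real implementation would also update the counters during rotations). `Node`, `size_of`, `insert`, `rank_of` and `kth` are all names invented for this example, and it needs C++14 for std::make_unique.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical node layout: an ordinary BST node plus one extra field.
struct Node {
    int key;
    std::size_t size = 1;  // the extra counter: number of nodes in this subtree
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

std::size_t size_of(const std::unique_ptr<Node>& n) { return n ? n->size : 0; }

// Unbalanced insertion that keeps the counters correct. Returns false for duplicates.
bool insert(std::unique_ptr<Node>& root, int key) {
    if (!root) { root = std::make_unique<Node>(key); return true; }
    if (key == root->key) return false;
    bool inserted = insert(key < root->key ? root->left : root->right, key);
    if (inserted) ++root->size;
    return inserted;
}

// Number of keys strictly smaller than key, computed in one root-to-leaf walk.
std::size_t rank_of(const std::unique_ptr<Node>& root, int key) {
    std::size_t rank = 0;
    const Node* cur = root.get();
    while (cur) {
        if (key <= cur->key) {
            cur = cur->left.get();
        } else {
            rank += size_of(cur->left) + 1;  // skip the left subtree and this node
            cur = cur->right.get();
        }
    }
    return rank;
}

// The i-th smallest key (0-based); i must be smaller than the tree size.
int kth(const std::unique_ptr<Node>& root, std::size_t i) {
    assert(i < size_of(root));
    const Node* cur = root.get();
    while (true) {
        std::size_t left = size_of(cur->left);
        if (i < left) cur = cur->left.get();
        else if (i == left) return cur->key;
        else { i -= left + 1; cur = cur->right.get(); }
    }
}
```

The extra `size` member is exactly the per-node cost mentioned above: every node pays for it whether or not anyone ever asks for an index.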

score 5 · Accepted Answer · answered Feb 05 '13 at 16:04

No. It is not possible (with the predefined containers). The sequence containers of the C++ Standard Library have either:

O(1) random access and O(N) insertion/removal or
O(N) random access and O(1) insertion/removal

Note that deque is an exception, but only when the insertion/removal takes place at the ends of the array. The general case is still O(N).
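
As an illustration of the first trade-off, here is a small sketch using a sorted std::vector: the index lookup becomes O(log(N)) thanks to random access, but the insertion stays O(N) because elements have to be shifted. The function names `sorted_insert` and `index_of` are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Insert while keeping the vector sorted: O(log N) search, then an O(N) shift.
void sorted_insert(std::vector<int>& v, int value) {
    assert(std::is_sorted(v.begin(), v.end()));  // precondition, checked in debug builds
    v.insert(std::upper_bound(v.begin(), v.end(), value), value);
}

// Index of the first occurrence of k, or v.size() if k is absent: O(log N).
std::size_t index_of(const std::vector<int>& v, int k) {
    auto it = std::lower_bound(v.begin(), v.end(), k);
    if (it == v.end() || *it != k) return v.size();
    return static_cast<std::size_t>(it - v.begin());
}
```

So with the standard containers you can make either query fast, just not both at once.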

Furthermore, the classification of iterators does not include a category for this case. You have the bidirectional iterators (those of list, set, multiset, map and multimap), which take O(N) time to jump to a random position, and the next category is for random access iterators (those of vector, deque and string). There is no intermediate category.

Adding a new category would not be trivial at all. The library also implements a lot of algorithms (like for_each) that work with containers. There is an implementation for every iterator category.

Order statistic trees have been proposed at Boost several times. As far as I know:

2004: First suggestion (I don't know if it was implemented)
2006: "Hierarchical Data Structures"
2006: AVL Array (renamed as "Rank List" in Boost)
2012: Counter tree

The main obstacle to their acceptance was the general opinion that they were not a benefit, but a hazard. Today's programmers are used to solving all the problems they know with the typical containers. Experienced programmers fear that newbies might blindly use the proposed container for everything, instead of choosing carefully.

It's a pity that they never made it into boost, especially for such a stupid reason :( — wump, Aug 29 '13 at 09:34

score 0 · Answer 3 · answered Dec 24 '16 at 13:56

Although I agree that there is no completely built-in way of doing this in C++, there is a good workaround: use a segment tree. Let the segment tree store the frequency of each element that has been encountered. Inserting an element is basically incrementing its count by 1, while the query is just the segment sum over indices 0 to element - 1. Both of these can easily be done in O(log n). I know the downside is that you need to know the range of values that can be present, but in many practical problems a good upper bound suffices.
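
A minimal sketch of this idea, assuming the elements are non-negative integers no larger than a known `max_value`. The class name `FrequencyTree` and its members are invented for this example, and it uses an iterative (bottom-up) segment tree rather than a recursive one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Segment tree of occurrence counts over the value range [0, max_value].
class FrequencyTree {
public:
    explicit FrequencyTree(std::size_t max_value)
        : n_(max_value + 1), count_(2 * n_, 0) {}

    // Record one occurrence of value: bump its leaf and every ancestor, O(log n).
    void insert(std::size_t value) {
        assert(value < n_);
        for (std::size_t i = value + n_; i >= 1; i /= 2) ++count_[i];
    }

    // Number of stored elements strictly less than value,
    // i.e. the segment sum over [0, value): O(log n).
    std::size_t index_of(std::size_t value) const {
        assert(value < n_);
        std::size_t total = 0;
        for (std::size_t lo = n_, hi = value + n_; lo < hi; lo /= 2, hi /= 2) {
            if (lo & 1) total += count_[lo++];
            if (hi & 1) total += count_[--hi];
        }
        return total;
    }

private:
    std::size_t n_;                   // number of distinct values supported
    std::vector<std::size_t> count_;  // leaves live at indices [n_, 2 * n_)
};
```

For example, after inserting 5, 2 and 8 into a `FrequencyTree t(10)`, `t.index_of(5)` gives 1. Memory is proportional to the value range rather than to the number of elements, which is the downside mentioned above.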
