9

I want to find rank of an element in stl set. I am able to traverse from beginning to that element and find out its rank but that is taking O(n). Is there any method to find the rank in O(logn).

TemplateRex
  • 69,038
  • 19
  • 164
  • 304
Vineet Setia
  • 347
  • 1
  • 4
  • 11

5 Answers5

6

No; a balanced tree does not need to store the number of descendants of each node, which is required to more quickly compute distance( s.begin(), iter ) for std::set s and iterator iter (which is what I suppose you mean). Therefore the information simply doesn't exist except by counting the items one by one.

If you need to perform many such computations, copy the set into a sorted, random-access sequence such as vector or deque, but then modification of the sequence becomes expensive.

A tree data structure that does what you ask probably exists in a free library somewhere, but I don't know of one.

Potatoswatter
  • 134,909
  • 25
  • 265
  • 421
  • I can create a balanced BST and use it. But I read somewhere that we can find rank in std::set. Thats y curious to know how to find it. – Vineet Setia Aug 07 '13 at 05:22
  • 1
    @user2613413 If "rank" is what I just described, then the notation to find it is what I mentioned: `std::distance( s.begin(), iter )`. But the algorithm is just a simple O(N) loop. If you substitute a suitable container, `distance` should employ an O(log N) algorithm instead, but no such container exists in the Standard Library. For `vector` and `deque`, it is O(1). – Potatoswatter Aug 07 '13 at 06:25
  • @Potatoswatter Boost.MultiIndex does provide random access funtionality by adding some extra pointer per node in the tree. – TemplateRex Aug 07 '13 at 06:43
  • @TemplateRex Random access means O(1) to increment an iterator by any distance. I can't believe that merely adding one pointer to each node exposes an O(1) path from each node to any other. It sounds like those pointers go to a central table which needs to be updated in O(N) for various operations. To know how far you're going by skipping a subtree, you need the total size of the subtree, not a pointer. And keeping those sizes updated is more expensive on average than more popular methods like red-black. – Potatoswatter Aug 07 '13 at 06:54
  • Tnx, upon closer reading of the docs, Boost.MultiIndex appears to do the same as Boost.Container's flat_set, i.e. keeping a sorted vector of indices and 1 pointer per node into that vector. This does give random access iterators but at the cost of linear insertion/erase. I updated my answer so that now only Boost.Container is mentioned. – TemplateRex Aug 07 '13 at 07:05
4

What you are looking for is called an Order Statistic Tree. If you are using GNU C++ library, you should have an extension available for building order statistic trees. A short example is given below:

#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
#include <cstdio>
using namespace std;
using namespace pb_ds;

typedef tree<
int, /* key type */
null_mapped_type, /* value type */
less<int>, /* comparison */
rb_tree_tag, /* for having an rb tree */
tree_order_statistics_node_update> order_set;

int main()
{
    order_set s;
    s.insert(10);
    s.insert(20);
    s.insert(50);
    s.insert(25);

    printf("rank of 25 = %d\n", s.order_of_key(25));
}

Output should be rank of 25 = 2. For more examples, you can see this file.

Subhasis Das
  • 1,667
  • 13
  • 13
2

The functionality of a sorted vector suggested by @Potatoswatter is provided by the flat_set from Boost.Container. The documentation lists the following trade-offs

  • Faster lookup than standard associative containers
  • Much faster iteration than standard associative containers
  • Less memory consumption for small objects (and for big objects if shrink_to_fit is used)
  • Improved cache performance (data is stored in contiguous memory)
  • Non-stable iterators (iterators are invalidated when inserting and erasing elements)
  • Non-copyable and non-movable values types can't be stored
  • Weaker exception safety than standard associative containers (copy/move constructors can throw when shifting values in erasures and insertions)
  • Slower insertion and erasure than standard associative containers (specially for non-movable types)
TemplateRex
  • 69,038
  • 19
  • 164
  • 304
1

There is actually a built-in solution if you are using GCC, but Subhasis Das's answer is somewhat outdated and will not work with newer versions of GCC due to updates. The header is now

#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace __gnu_pbds;

and the set structure is

typedef tree<
int, 
null_type, 
std::less<int>, 
rb_tree_tag, 
tree_order_statistics_node_update> ordered_set;

Alternatively, if a multiset is needed, the std::less<int> can be repalced with std::less_equal<int>.

Here's a demonstration of find-by-rank:

#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace __gnu_pbds;
#include <iostream>

typedef tree<int, null_type, std::less_equal<int>, rb_tree_tag, tree_order_statistics_node_update> ordered_set;

int main()
{
    ordered_set s;
    s.insert(10);
    s.insert(20);
    s.insert(50);
    s.insert(25);

    for(int i=24; i<=26; i++) std::cout << "Rank of " << i << ": " << s.order_of_key(i) << std::endl;
}
Baaing Cow
  • 1,492
  • 9
  • 20
-1

I think there is a lower_bound function in the STL set in C++ and this can be used to find the rank of an element in a set. Take a look at https://www.geeksforgeeks.org/set-lower_bound-function-in-c-stl/.