0

My question pertains to the std::map class in the STL of c++. My understanding is that it is implemented using red black trees to allow for O(logn) insertions, deletions, and lookups while maintaining order with respect to the keys.

I encountered a problem recently where I needed to access the ith element in the map, but the class only implements an iterator that can be advanced one entry at a time. In other words I would need to increment the iterator i times to get to the ith element in the map.

Ultimately I'm curious if it's possible to implement std::map such that the lookup of the ith element can be done in O(logn) time rather than O(n) time as it currently stands. I believe this is possible by keeping track of the number of nodes in the left and right subtrees of each node. That would allow for a search for the ith element like this (where node->num_left/right indicate the number of nodes in the left and right subtrees)

Node* getNodeAtIndex(int i) {
   Node *cur = root;
   int num_to_left=root->num_left, num_to_right=root->num_right;
   while (num_to_left != i) {
      if (i < num_to_left) { // move left
         cur = cur->left;
         num_to_left -= (1 + cur->num_right);
         num_to_right += (1 + cur->num_right);
      } else { // move right
         cur = cur->right;
         num_to_left += (1 + cur->num_left);
         num_to_right -= (1 + cur->num_left);
      }
   }
   return cur;
}

This post: Get index of element in C++ map relates very closely however many answers discuss how the ith element in a map is undefined. However, I believe it is defined in the sense that the "ith" element has i entries in the map with keys less than its key and map.size()-i-1 entries with keys greater than its key.

I'm sure this has been considered, I'm just wondering why it isn't implemented.

Nareg
  • 1
  • Short answer: Nope, that's not how `map`s work. Accessing by "index" is an inherently linear operation on `map`. – ShadowRanger Oct 24 '20 at 02:16
  • Thanks for linking that other post, is there an implementation like the one I proposed in any other c++ libraries? – Nareg Oct 24 '20 at 02:41
  • I doubt it. Among other things, your proposal would significantly increase the memory overhead; every node would add two `size_t` members (16 bytes on 64 bit code) that *only* got used for this use case that's not part of `map`'s intended function, and would be rarely used, if at all. Beyond that, I believe red-black trees' rebalancing operations, that they need to do to maintain their invariants, including guaranteed `O(log n)` operations, mean nodes can shift places in ways that would make it *very* hard if not impossible to keep the counts accurate without making rebalancing >`O(log n)`. – ShadowRanger Oct 24 '20 at 03:13
  • Okay that's fair, thanks! – Nareg Oct 24 '20 at 04:05
  • Note that it is possible to do this while only tracking the number of left children (possibly left children plus self). It adds some calculation overhead because you have to examine the parent node's count in order to get the size of the right sub-tree but you will very often already have that node anyway. Unfortunately the existing re-usable red-black implementations I'm aware of (primarily boost::intrusive::rbtree) aren't particularly suitable because of the count manipulations that go along with rotations, those implementations don't have a good place to hook for that. – SoronelHaetir Oct 24 '20 at 18:13
  • Interesting. It seems to me that in the case where you are doing lots of insertions and need to access the data structure by both key and index efficiently this data structure would be better than both map and flat_map because map is O(n) searching for the index and flat_maps require the array backed store to be adjusted on inserts, thus making inserts O(n). This would allow for O(logn) operations on inserts and accesses (although you do lose cache benefits with tree structures). Does this altered RBTree that we're talking about have a name? – Nareg Oct 24 '20 at 18:31
  • It's called an order statistic tree, and does not require an RB to implement, RB tree is just a possible implementation detail (AVL too, or not balanced at all for that matter). And note that the rotations aren't terribly more complicated, a left rotation for example need only add the weight (the node count) of the old parent to its former right child. A right rotation need only subtract the weight of the former left child from the original parent. And both of those nodes already need to be retrieved anyway. The changes do not bubble any higher. – SoronelHaetir Oct 25 '20 at 03:40

0 Answers0