1

Is there a way to randomly select an element from an std::set in O(log(n)) or even better O(1) time? It doesn't need to have an extremely even distribution, just something fairly random (though obviously even is better)

Looking at std::set::extract seems like it might be promising since it could potentially divide the set in half in constant time, but I can't find a good way to identify the nodes close to the root, and the node_type documentation I could find doesn't go into much detail.

If all else fails, the contents could be copied to a std::map with random keys in what I think would be O(n log(n)) time, which would get me amortized O(log(n)) time, but that is not the preferred solution since it would require some overhead in the cases where I don't want everything

rtpax
  • 1,687
  • 1
  • 18
  • 32
  • 1
    Do you need to use a `std::set`? If you want random access then you should use a container that provides random access. – NathanOliver Aug 21 '18 at 15:52
  • @NathanOliver Yes, `std::set` gives noticable performance benefits elsewhere in my code, I would rather not use a different container – rtpax Aug 21 '18 at 16:00
  • `std::set` already provides `O(ln(n))` access time. If you want `O(1)` then you should use `std::unordered_set`. see: https://stackoverflow.com/questions/181693/what-are-the-complexity-guarantees-of-the-standard-containers – Martin York Aug 21 '18 at 16:49

2 Answers2

2

If the elements themselves within the set are evenly distributed within some domain of values, then you could generate a random value within that domain, and use std::set::lower_bound to get the first element contained in the set that is not less than the random value.

Given that you don't need an extremely even distribution, the evenness requirement of the elements within the set might not be extremely necessary. The likelihood of an element to be chosen depends on its comparative distance from its predecessor element.

For an even distribution, I don't think there's better than *std::next(std::begin(s), random_index), which is linear in complexity.


For a good general solution with even distribution, and logarithmic asymptotic complexity, you need another data structure than std::set.

In particular, one good choice is the Order statistic tree which augments a search tree by adding into a node the size of its subtree. OST has a Select(i) operation, which is similar to array subscript operation, and you can pick a random element between index 0...N in the same way.

Another option is to use a sorted array. The sorted property can be used to keep the logarithmic lookup property that std::set has.

Unfortunately though, the standard library has neither order statistic tree, nor a sorted-array-set.

eerorika
  • 232,697
  • 12
  • 197
  • 326
0

std::set complexity is o(log(n)) so you can't lower search complexity bellow that one using std::set alone. You may use use an indexing structure like a vector to achieve that.

Besides that you cant achieve a random seach using this code :

 std::random_device              rd;
 std::mt19937                    gen(rd());
 std::uniform_int_distribution<> dis( 0, set::size ( ) );

then

 set::operator [] (dis(gen));

or something appoching

Gold
  • 136
  • 6