47

what is a good way to select a random element from a map? C++. It is my understanding that maps don't have random access iterators. The key is a long long and the map is sparsely populated.

Deathbob
  • 1,077
  • 2
  • 10
  • 11
  • 5
    Why do you need to do this? Using a map implies you want fast lookup based on a key, random lookup is going to be O(N)... – Greg Rogers Oct 01 '08 at 18:07

9 Answers9

49
map<...> MyMap;
iterator item = MyMap.begin();
std::advance( item, random_0_to_n(MyMap.size()) );
bobobobo
  • 64,917
  • 62
  • 258
  • 363
James Curran
  • 101,701
  • 37
  • 181
  • 258
16

I like James' answer if the map is small or if you don't need a random value very often. If it is large and you do this often enough to make speed important you might be able to keep a separate vector of key values to select a random value from.

map<...> MyMap;
vector<...> MyVecOfKeys; // <-- add keys to this when added to the map.

map<...>::key_type key = MyVecOfKeys[ random_0_to_n(MyVecOfKeys.size()) ];
map<...>::data_type value = MyMap[ key ];

Of course if the map is really huge you might not be able to store a copy of all the keys like this. If you can afford it though you get the advantage of lookups in logarithmic time.

ryan_s
  • 7,826
  • 3
  • 27
  • 28
  • The map isn't very big right now so i'm interested in doing it the easy way, but it could grow to be large. It might then be worth customizing the map class a little bit to allow me to pick a random key in this manner without storing a redundant vector. Is there a way to do that already? – Deathbob Oct 01 '08 at 18:17
  • I don't think there is a good way to get at the keys of a map without resorting to traversing the map in linear time, owing to the fact that it is stored as a tree. Are you sure a map is the best data structure for your purposes? – ryan_s Oct 01 '08 at 18:25
  • I think we can speed this up a little, see my answer. – Constantin Oct 03 '08 at 23:13
9

Maybe draw up a random key, then use lower_bound to find the closest key actually contained.

Assaf Lavie
  • 73,079
  • 34
  • 148
  • 203
  • 3
    If the keys are evenly distributed that might be a good idea. If they aren't you'd wind up getting one key more often than the others. – ryan_s Oct 01 '08 at 20:45
  • 1
    This is indeed the way to sample from the [empirical distribution](http://en.wikipedia.org/wiki/Empirical_measure) of a sample. – Alexandre C. Dec 18 '12 at 21:28
5

Continuing ryan_s theme of preconstructed maps and fast random lookup: instead of vector we can use a parallel map of iterators, which should speed up random lookup a bit.

map<K, V> const original;
...

// construct index-keyed lookup map   
map<unsigned, map<K, V>::const_iterator> fast_random_lookup;
map<K, V>::const_iterator it = original.begin(), itEnd = original.end();
for (unsigned i = 0; it != itEnd; ++it, ++i) {
  fast_random_lookup[i] = it;
}

// lookup random value
V v = *fast_random_lookup[random_0_to_n(original.size())];
Constantin
  • 27,478
  • 10
  • 60
  • 79
2

If your map is static, then instead of a map, use a vector to store your key/value pairs in key order, binary search to look up values in log(n) time, and the vector index to get random pairs in constant time. You can wrap the vector/binary search to look like a map with a random access feature.

ejgottl
  • 2,809
  • 19
  • 18
1

Maybe you should consider Boost.MultiIndex, although note that it's a little too heavy-weighted.

Weipeng
  • 1,494
  • 14
  • 11
0

Here is the case when all map items must be access in random order.

  1. Copy the map to a vector.
  2. Shuffle vector.

In pseudo-code (It closely reflects the following C++ implementation):

import random
import time

# populate map by some stuff for testing
m = dict((i*i, i) for i in range(3))
# copy map to vector
v = m.items()
# seed PRNG   
#   NOTE: this part is present only to reflect C++
r = random.Random(time.clock()) 
# shuffle vector      
random.shuffle(v, r.random)
# print randomized map elements
for e in v:
    print "%s:%s" % e, 
print

In C++:

#include <algorithm>
#include <iostream>
#include <map>
#include <vector>

#include <boost/date_time/posix_time/posix_time_types.hpp>
#include <boost/foreach.hpp>
#include <boost/random.hpp>

int main()
{
  using namespace std;
  using namespace boost;
  using namespace boost::posix_time;

  // populate map by some stuff for testing
  typedef map<long long, int> Map;
  Map m;
  for (int i = 0; i < 3; ++i)
    m[i * i] = i;

  // copy map to vector
#ifndef OPERATE_ON_KEY
  typedef vector<pair<Map::key_type, Map::mapped_type> > Vector;
  Vector v(m.begin(), m.end());
#else
  typedef vector<Map::key_type> Vector;
  Vector v;
  v.reserve(m.size());
  BOOST_FOREACH( Map::value_type p, m )
    v.push_back(p.first);
#endif // OPERATE_ON_KEY

  // make PRNG
  ptime now(microsec_clock::local_time());
  ptime midnight(now.date());
  time_duration td = now - midnight;
  mt19937 gen(td.ticks()); // seed the generator with raw number of ticks
  random_number_generator<mt19937, 
    Vector::iterator::difference_type> rng(gen);

  // shuffle vector
  //   rng(n) must return a uniformly distributed integer in the range [0, n)
  random_shuffle(v.begin(), v.end(), rng);

  // print randomized map elements
  BOOST_FOREACH( Vector::value_type e, v )
#ifndef OPERATE_ON_KEY
    cout << e.first << ":" << e.second << " ";
#else
    cout << e << " ";
#endif // OPERATE_ON_KEY
  cout << endl;
}
jfs
  • 399,953
  • 195
  • 994
  • 1,670
0

Has anyone tried this? https://github.com/mabdelazim/Random-Access-Map "C++ template class for random access map. This is like the std::map but you can access items random by index with syntax my_map.key(i) and my_map.data(i)"

user3259383
  • 149
  • 1
  • 9
0
std::random_device dev;
std::mt19937_64 rng(dev());

std::uniform_int_distribution<size_t> idDist(0, elements.size() - 1);
auto elementId= elements.begin();
std::advance(elementId, idDist(rng));

Now elementId is random :)

Matheus Toniolli
  • 438
  • 1
  • 7
  • 16