70

What is a good way to get a [pseudo-]random element from an STL range?

The best I can come up with is to do std::random_shuffle(c.begin(), c.end()) and then take my random element from c.begin().

However, I might want a random element from a const container, or I might not want the cost of a full shuffle.

Is there a better way?

Ciro Santilli OurBigBook.com
  • 347,512
  • 102
  • 1,199
  • 985
paperjam
  • 8,321
  • 12
  • 53
  • 79
  • I don't really understand what you want. Do you just want a random number each time instead of a series ? If so you can just call sran(time(NULL)); then call rand(); – RoundPi Aug 04 '11 at 13:31

9 Answers9

67

I posted this solution on a Google+ article where someone else referenced this. Posting it here, as this one is slightly better than others because it avoids bias by using std::uniform_int_distribution:

#include  <random>
#include  <iterator>

template<typename Iter, typename RandomGenerator>
Iter select_randomly(Iter start, Iter end, RandomGenerator& g) {
    std::uniform_int_distribution<> dis(0, std::distance(start, end) - 1);
    std::advance(start, dis(g));
    return start;
}

template<typename Iter>
Iter select_randomly(Iter start, Iter end) {
    static std::random_device rd;
    static std::mt19937 gen(rd());
    return select_randomly(start, end, gen);
}

Sample use is:

#include <vector>
using namespace std;

vector<int> foo;
/* .... */
int r = *select_randomly(foo.begin(), foo.end());

I ended up creating a gist with a better design following a similar approach.

Christopher Smith
  • 5,372
  • 1
  • 34
  • 18
  • 1
    Yes, this is very well explained here: http://channel9.msdn.com/Events/GoingNative/2013/rand-Considered-Harmful – klimkin Sep 16 '13 at 22:15
  • I have a question: why don't you combine the first select_randomly into the second select_randomly ? Beside the fact that you want to re-use the code of the first select_randomly elsewhere. – 123iamking Aug 04 '17 at 08:01
  • 1
    according to this answer: https://stackoverflow.com/a/40656641/1701248 function select_randomly should use mutex inside to be thread safe. – ctinka May 16 '20 at 23:08
57

C++17 std::sample

This is a convenient method to get several random elements without repetition.

main.cpp

#include <algorithm>
#include <iostream>
#include <random>
#include <vector>

int main() {
    const std::vector<int> in{1, 2, 3, 5, 7};
    std::vector<int> out;
    size_t nelems = 3;
    std::sample(
        in.begin(),
        in.end(),
        std::back_inserter(out),
        nelems,
        std::mt19937{std::random_device{}()}
    );
    for (auto i : out)
        std::cout << i << std::endl;
}

Compile and run:

g++-7 -o main -std=c++17 -Wall -Wextra -pedantic main.cpp
./main

Output: 3 random numbers are picked from 1, 2, 3, 5, 7 without repetition.

For efficiency, only O(n) is guaranteed since ForwardIterator is the used API, but I think stdlib implementations will specialize to O(1) where possible (e.g. vector).

Tested in GCC 7.2, Ubuntu 17.10. How to obtain GCC 7 in 16.04.

Ciro Santilli OurBigBook.com
  • 347,512
  • 102
  • 1,199
  • 985
34

All the answers using % here are incorrect, since rand() % n will produce biased results: imagine RAND_MAX == 5 and the number of elements is 4. Then you'll get twice more the number 0 and 1 than the numbers 2 or 3.

A correct way to do this is:

template <typename I>
I random_element(I begin, I end)
{
    const unsigned long n = std::distance(begin, end);
    const unsigned long divisor = (RAND_MAX + 1) / n;

    unsigned long k;
    do { k = std::rand() / divisor; } while (k >= n);

    std::advance(begin, k);
    return begin;
}

Another problem is that std::rand is only assumed to have 15 random bits, but we'll forget about this here.

Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
  • Yeah, that's a better use of std::rand(). – Dave S Aug 04 '11 at 15:00
  • @AlexandreC. : Sorry - I downvoted incorrectly, and now the vote is locked in. Upvoted one of your other correct answers for balance. – Deestan Jun 19 '12 at 15:40
  • 3
    Am I correct that std::advance will return void, I've just looked into http://www.sgi.com/tech/stl/advance.html. – math Aug 07 '12 at 06:48
  • And we could return either `end` or `begin` in case std::distance returns 0. – math Aug 07 '12 at 07:03
  • 5
    Now, I wonder why no one have had the same thoughts as I do: My compiler warns me for an integer overflow in expression ( RAND_MAX + 1 ). So could you explain why we need the +1 here? – math Aug 07 '12 at 07:18
  • 2
    @AlexandreC.: Doesn't the bias only become significant if the container size is close to RAND_MAX? (RAND_MAX is at least 32767 on any standard library implementation.) – jlstrecker Feb 28 '13 at 19:50
  • @jlstrecker, it all depends on your definition of significant. The beauty of this method is that it does less work as the chances of bias become smaller. – Mark Ransom May 07 '13 at 15:07
  • 3
    @math the *range* of possible values from `rand` is 0 to RAND_MAX, so the *count* of possible values is RAND_MAX+1. If that results in overflow you need to use a larger integer type, or find a different method of eliminating the bias. – Mark Ransom May 07 '13 at 15:09
  • 1
    Still seems like it kind of screams for std::uniform_int_distribution no? That way the code is simpler & clearer. – Christopher Smith May 08 '13 at 01:25
  • At which point the probability of extra work is: (RAND_MAX - (divisor*n))/RAND_MAX. That's proportional to the size of the bias depending on how you define the "size" of the bias (really it's proportional to the # of values with extra chances of being selected). – Christopher Smith May 08 '13 at 01:45
  • 1
    @ChristopherSmith: Yes, but that is C++11. Also MSVC10's version is buggy. The issue with this code is `std::rand`: this ought to be replaced by a better generator. – Alexandre C. May 08 '13 at 14:16
  • @klimkin: Visual Studio 2010's uniform_int_distribution has bias issues, so this should be avoided in portable code. Also my answer is from before C++11. – Alexandre C. Sep 17 '13 at 07:03
  • Note that for a std::list, `std::distance(begin,end)` is linear time but `list::size()` is constant time ([guaranteed since C++11, optional before that]). So if you have a container that has a good reason for being a list, not random access, but you occasionally need a random element from it, you should use `.size()` to avoid an extra traversal over the whole list (beyond std::advance's traversal up to the selected element). If it was worth optimizing (but not changing container), you could even decide whether to traverse from the start or end based on your random number being > half. – Peter Cordes Oct 05 '16 at 17:01
9

This works fine as long as RAND_MAX is much greater than the container size, otherwise it suffers from the bias problem cited by Alexandre:

vector<int>::iterator randIt = myvector.begin();
std::advance(randIt, std::rand() % myvector.size());
Antonio
  • 19,451
  • 13
  • 99
  • 197
cprogrammer
  • 5,503
  • 3
  • 36
  • 56
3

If you can't access the size, I think you would want to do the following. It returns the iterator to the random element.

#include <algorithm>
#include <iterator>

template <class InputIterator> InputIterator 
random_n(InputIterator first, InputIterator last) {
   typename std::iterator_traits<InputIterator>::difference_type distance = 
        std::distance(first, last);
   InputIterator result = first;
   if (distance > 1) {
      // Uses std::rand() naively.  Should replace with more uniform solution. 
      std::advance( result, std::rand() % distance );
   }
   return result;
}
// Added in case you want to specify the RNG.  RNG uses same 
// definition as std::random_shuffle
template <class InputIterator, class RandomGenerator> InputIterator 
random_n(InputIterator first, InputIterator last, RandomGenerator& rand) {
   typename std::iterator_traits<InputIterator>::difference_type distance = 
       std::distance(first, last);
   InputIterator result = first;
   if (distance > 1) {
      std::advance( result, rand(distance) );
   }
   return result;
}
Dave S
  • 20,507
  • 3
  • 48
  • 68
  • I like this solution most. If the iterators are random access, it is O(1), but if need be it also works with simple input iterators. – Björn Pollex Aug 04 '11 at 13:41
  • 5
    This method (and all the others that were posted) will produce biased output if `RAND_MAX` isn’t evenly divisible by the number of elements. For large(-ish) containers the bias will be quite pronounced. Use a proper [uniform deviate](http://eternallyconfuzzled.com/arts/jsw_art_rand.aspx). – Konrad Rudolph Aug 04 '11 at 13:49
  • Realistically, you'd probably want a version that allows the user to provide a functor for the RNG. That said, the libstdc++ random_shuffle does rand() % length by default, so I just used the same logic. – Dave S Aug 04 '11 at 13:50
  • 1
    @DaveS Eek! It does? How horrid! – Konrad Rudolph Aug 04 '11 at 13:51
  • @Konrad: this cannot be: ISO/IEC 14882/2003, 25.2.11: "Shuffles the elements in the range [first, last) with uniform distribution". So `rand() % n` cannot be standard compliant. – Alexandre C. Aug 04 '11 at 13:59
  • @Konrad: By the way, VS2010 does the right thing (correct accept-reject method with `std::rand`). – Alexandre C. Aug 04 '11 at 14:02
  • @Konrad: My bad, it doesn't. I was impressed by the obfuscation there in `` but in fact it is only fudge to accomodate the fact that `std::rand` has only 15 guaranteed random bits. So, I agree with you: eek ! – Alexandre C. Aug 04 '11 at 14:10
2

Take the number of elements, c.size(), then get a random_number between 0 and c.size(), and use:

auto it = c.begin();
std::advance(it, random_number)

Have a look at http://www.cplusplus.com/reference/clibrary/cstdlib/rand/

ypnos
  • 50,202
  • 14
  • 95
  • 141
  • 6
    use rather `*std::advance(c.begin(), random_number)`. Iterators from `std::set` for instance are only bidirectional, not random access. – Alexandre C. Aug 04 '11 at 13:30
  • 2
    @Alexandre: On the other hand, the OP currently uses `random_shuffle` which requires random-access-iterators, so he seems to be fine with that restriction. – Björn Pollex Aug 04 '11 at 13:42
  • @Björn: good point. Using `std::distance` is more generic though (and the OP wants something which acts on a range). – Alexandre C. Aug 04 '11 at 13:57
  • 2
    @AlexandreC. [`std::advance` returns `void`](http://en.cppreference.com/w/cpp/iterator/advance), unfortunately, so you need to advance and dereference separately. – Peter Cordes Oct 05 '16 at 16:56
1

You can try to get a random number between 0 and the number of elements of the container. You could then access to the corresponding element of the container. For example, you can do this:

#include <cstdlib>
#include <ctime>

// ...
std::srand(std::time(0)); // must be called once at the start of the program
int r = std::rand() % c.size() + 1; 
container_type::iterator it = c.begin();
std::advance(it, r);
Cedekasme
  • 2,027
  • 1
  • 16
  • 16
  • 1
    Just make sure you call `srand()` ONCE, at the beginning of the program. If you call it before every `rand()` call, you won't be happy with the randomness of the result! – Fred Larson Aug 04 '11 at 13:43
  • This also has issues with non-uniform distribution of selections for most sizes of containers, much like all the other examples that don't use std::uniform_int_distribution. – Christopher Smith May 07 '13 at 19:43
  • This code has an off by 1 error too. r's possible values are [1,c.size()] inclusive. It will never therefore never choose the first element and will periodically choose an element one past the end of the vector. It also suffers from the bias problem. – Christopher Smith May 08 '13 at 01:44
0
#include <vector>
using std::vector;

vector<int> vi = {1, 2, 3, 4, 5};
srand(time(0));
randomPosition = rand() % vi.size();
randomElement = vi[randomPosition]

srand(time(0)) seed for random, if not use it, it will get same random integer every execution.

max
  • 187
  • 1
  • 2
  • 9
-2

You can use 0~1 random function to generate a float number for every element in the container as its score. And then select the one with highest score.

zcc
  • 1