Randomly selecting N distinct numbers without repetition

Question

I want to have a set of randomly selected N distinct numbers from {1,2,...(k-1),K} where K>N. I want to write a C program to efficiently do this.

Any help appreciated.

Note:

The naive program is just to generate random number, check whether it has been already generated and if not add it to the list is easy to implement. I was looking for something more efficient as if K and N are comparable then this will be just inefficient computation.

If `K` and `N` are comparable, then a Fisher/Yates shuffle (`N` steps of it) would do well. That's pretty bad if `K` is much larger than `N`, then create-and-check-for-dupes is better. — Daniel Fischer, Apr 29 '13 at 13:24
Much has already been said in the following SO questions: http://stackoverflow.com/questions/6947612/generating-m-distinct-random-numbers-in-the-range-0-n-1, http://stackoverflow.com/questions/2576023/algorithm-to-generate-1000-distinct-integers-in-the-range-0-8000, http://stackoverflow.com/questions/693880/create-random-number-sequence-with-no-repeats — harpun, Apr 29 '13 at 13:32

zw324 · Accepted Answer · 2013-04-29T14:01:21.277

3

When K and N is comparable, use Knuth shuffle, but limit the shuffle to just N times.

Note when they are not, this could waste a lot of memory.

edited Apr 29 '13 at 14:01

answered Apr 29 '13 at 13:24

zw324

26,764
16
85
118

You can shuffle just once and pick up the first N items, right? Knuth Shuffle is linear time and memory in K. – anoopelias Apr 29 '13 at 14:00
@anoopelias Thanks, I took K as the number of elements the OP wanted. – zw324 Apr 29 '13 at 14:01

score 2 · Answer 2 · answered Apr 29 '13 at 13:25

“Reservoir sampling is a family of randomized algorithms for randomly choosing k samples from a list S containing n items, where n is either a very large or unknown number.”

(note that the roles of K and N are reversed on that Wikipedia page with respect to your question)

Randomly selecting N distinct numbers without repetition

2 Answers2