1

I want to have a set of randomly selected N distinct numbers from {1,2,...(k-1),K} where K>N. I want to write a C program to efficiently do this.

Any help appreciated.

Note:

The naive program is just to generate random number, check whether it has been already generated and if not add it to the list is easy to implement. I was looking for something more efficient as if K and N are comparable then this will be just inefficient computation.

Aman Deep Gautam
  • 8,091
  • 21
  • 74
  • 130
  • If `K` and `N` are comparable, then a Fisher/Yates shuffle (`N` steps of it) would do well. That's pretty bad if `K` is much larger than `N`, then create-and-check-for-dupes is better. – Daniel Fischer Apr 29 '13 at 13:24
  • 2
    Much has already been said in the following SO questions: http://stackoverflow.com/questions/6947612/generating-m-distinct-random-numbers-in-the-range-0-n-1, http://stackoverflow.com/questions/2576023/algorithm-to-generate-1000-distinct-integers-in-the-range-0-8000, http://stackoverflow.com/questions/693880/create-random-number-sequence-with-no-repeats – harpun Apr 29 '13 at 13:32

2 Answers2

3

When K and N is comparable, use Knuth shuffle, but limit the shuffle to just N times.

Note when they are not, this could waste a lot of memory.

zw324
  • 26,764
  • 16
  • 85
  • 118
2

Reservoir sampling is a family of randomized algorithms for randomly choosing k samples from a list S containing n items, where n is either a very large or unknown number.”

(note that the roles of K and N are reversed on that Wikipedia page with respect to your question)

Pascal Cuoq
  • 79,187
  • 7
  • 161
  • 281