15

Possible Duplicate:
pick N items at random

I need to generate 6 random numbers between 1 and 49, but they cannot be the same. I know how to make them random; I am just not sure how to ensure that they are different.

The worksheet recommends displaying each number and setting it to zero, but I don't see how that would help.

Any advice is greatly appreciated.

keirbtre
  • You should post what you have tried. – asheeshr Nov 29 '12 at 15:12
  • You realize of course that if they can't be the same, by definition they are no longer really random. – Daniel Roseman Nov 29 '12 at 15:23
  • Yes, they are still random, just that they're pulled from a slightly smaller list. – keirbtre Nov 29 '12 at 15:34
  • @keirbtre: By enforcing any constraint on random numbers, you've made those numbers less random. – Chris Laplante Nov 29 '12 at 15:54
  • I'm aware that they are less random, but they are still random nevertheless. You cannot know what number will be picked. – keirbtre Nov 29 '12 at 23:26
  • @finnw This is really not a duplicate. "Picking N items at random from sequence of unknown length" is an entirely different kind of problem because the length is unknown. For cases with a known length there are shortcuts that are not available when the sequence length is unknown. The distributions of the selected numbers will also be different. Please remove the misleading duplicate flag. – Roland Pihlakas Jan 05 '18 at 01:13

3 Answers

42

You can use random.sample:

>>> import random
>>> random.sample(range(1, 50), 6)
[26, 39, 36, 46, 37, 1]

"The worksheet recommends displaying each number and setting it to zero, but I don't see how that would help."

Assuming this is an assignment and you need to implement the sampling yourself, you could take a look at how random.sample is implemented. It's really informative, but may be too complicated for your needs, since the code also ensures that all sub-slices are themselves valid random samples. For efficiency, it also uses different approaches depending on the population size.
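As a rough illustration of one such approach (a simplified sketch, not the actual library code), a sample can be drawn from a copy of the population by picking a random unpicked slot and then overwriting that slot with the last unpicked item. The function name `sample_from_pool` is made up for this example, and the code assumes Python 3.

```python
import random

def sample_from_pool(population, k):
    """Pick k distinct items using a shrinking pool (a partial shuffle)."""
    pool = list(population)
    n = len(pool)
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and the population size")
    result = []
    for i in range(k):
        j = random.randrange(n - i)   # index among the n - i unpicked items
        result.append(pool[j])
        pool[j] = pool[n - i - 1]     # move the last unpicked item into the gap
    return result

print(sample_from_pool(range(1, 50), 6))
```

Each pick costs constant time, at the price of copying the whole population up front.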

As for the worksheet, I believe it assumes you're starting off with a list of numbers from 1 to 49 and suggests that you replace numbers that you've selected with 0 so they can be skipped if reselected. Here's some pseudo code to get you started:

population = range(1, 50)  # list of numbers from 1 to 49
sample = []
until we get 6 samples:
  index = a random number from 0 to 48  # look up random.randint()
  if population[index] is not 0:  # if we found an unmarked value
    append population[index] to sample
    set population[index] = 0  # mark selected
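One way the pseudo code might translate into runnable Python 3 is sketched below. The function name `draw_by_marking` and its parameters are assumptions made for this example, and it relies on 0 never being a valid value in the population.

```python
import random

def draw_by_marking(count=6, low=1, high=49):
    """Draw `count` distinct numbers by zeroing out each one once it is picked."""
    population = list(range(low, high + 1))   # numbers from low to high
    if low <= 0 or count > len(population):
        raise ValueError("0 must be free as a marker and count must fit")
    sample = []
    while len(sample) < count:
        index = random.randint(0, len(population) - 1)  # randint includes both ends
        if population[index] != 0:        # found an unmarked value
            sample.append(population[index])
            population[index] = 0         # mark as selected
    return sample

print(draw_by_marking())
```

With 6 picks out of 49, a draw rarely lands on a marked slot, so the loop seldom needs more than a few extra iterations.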

If you wish to attempt something different, there are many other approaches to consider, e.g. randomising the list and then truncating it, or some form of reservoir sampling.
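Both alternatives can be sketched briefly in Python 3. The function names `sample_by_shuffle` and `reservoir_sample` are made up for this example, and the reservoir version shown is the basic textbook variant, one of several possible forms.

```python
import random

def sample_by_shuffle(population, k):
    """Shuffle a copy of the population, then keep the first k items."""
    items = list(population)
    random.shuffle(items)
    return items[:k]

def reservoir_sample(iterable, k):
    """Keep k items from a stream whose length need not be known in advance."""
    reservoir = []
    for seen, item in enumerate(iterable, start=1):
        if len(reservoir) < k:
            reservoir.append(item)        # fill the reservoir first
        else:
            j = random.randrange(seen)    # 0 <= j < number of items seen so far
            if j < k:
                reservoir[j] = item       # replace with probability k / seen
    return reservoir

print(sample_by_shuffle(range(1, 50), 6))
print(reservoir_sample(range(1, 50), 6))
```

Shuffling is the simplest to write but does work proportional to the whole list; the reservoir version makes a single pass and suits inputs that cannot be indexed.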

Good luck with your assignment.

Shawn Chin
13

A set will not keep any duplicates:

s = set()
while len(s) < 6:
    s.add(get_my_new_random_number())
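In the snippet above, `get_my_new_random_number()` is a placeholder. A runnable Python 3 sketch, assuming the numbers come from `random.randint(1, 49)` and using the made-up name `draw_with_set`, might look like this:

```python
import random

def draw_with_set(count=6, low=1, high=49):
    """Keep drawing until the set holds `count` distinct numbers."""
    if count > high - low + 1:
        raise ValueError("not enough distinct values in the range")
    chosen = set()
    while len(chosen) < count:
        chosen.add(random.randint(low, high))  # adding a duplicate has no effect
    return chosen

print(sorted(draw_with_set()))
```

A set does not preserve the order in which the numbers were drawn, so this fits cases where only the chosen values matter.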
E.Z.
2

It is a very common and stupid interview question; here is a solution:

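import random
a = list(range(1, 50))  # numbers 1 to 49
for i in range(6):
    # randint includes both endpoints, so the last valid index is len(a) - 1
    b = a[random.randint(0, len(a) - 1)]
    a.remove(b)  # remove the pick so it cannot be drawn again
    print(b)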

For those who care about efficiency, here is a benchmark of my solution and Chin's:

>>> random.sample(range(1, 50), 6)
[26, 39, 36, 46, 37, 1]

The results:

>python -mtimeit -s'import try2'
[38, 7, 31, 24, 30, 32]
100000000 loops, best of 3: 0.0144 usec per loop
>python -mtimeit -s'import try1'
36
26
41
31
37
14
100000000 loops, best of 3: 0.0144 usec per loop

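Both runs report the same time. Note, however, that with only -s given, timeit treats the import as setup and times the default statement (pass), so these figures do not actually compare the two solutions.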
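A sketch of how the two approaches could be timed directly with the `timeit` module in Python 3, wrapping each in a function so that the sampling call itself is the timed statement. The function names `remove_based` and `sample_based` and the loop count are assumptions made for this example; no timings are claimed here.

```python
import random
import timeit

def remove_based():
    """Pick 6 numbers by removing each pick from a list."""
    pool = list(range(1, 50))
    picked = []
    for _ in range(6):
        value = random.choice(pool)
        pool.remove(value)
        picked.append(value)
    return picked

def sample_based():
    """Pick 6 numbers with random.sample."""
    return random.sample(range(1, 50), 6)

for func in (remove_based, sample_based):
    seconds = timeit.timeit(func, number=10000)  # total time for 10000 calls
    print(func.__name__, seconds)
```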

0x90