This should do the trick.
import random
import math
# create set to store samples
a = set()
# number of distinct elements in the population
m = 10
# sample size
k = 2
# number of samples
n = 3
# this protects against an infinite loop (see Safety Note)
if n > math.comb(m, k):
    print(
        f"Error: {math.comb(m, k)} is the number of {k}-combinations "
        f"from a set of {m} distinct elements."
    )
    exit()
# the meat
while len(a) < n:
    a.add(tuple(sorted(random.sample(range(m), k=k))))
print(a)
With a set you are guaranteed to get a collection with no duplicate elements. However, a set treats (1, 2) and (2, 1) as two different elements, so both could end up inside, which is why sorted is applied. If [1, 2] is drawn, sorted([1, 2]) returns [1, 2]. If [2, 1] is subsequently drawn, sorted([2, 1]) also returns [1, 2], which won't be added to the set because (1, 2) is already in it. We use tuple because objects in a set have to be hashable and list objects are not.
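As a quick check of the points above, here is a small sketch (the variable names are only illustrative) showing that a set keeps (1, 2) and (2, 1) as separate elements, that sorting each draw before converting it to a tuple collapses them into one entry, and that a list cannot be stored in a set:

```python
# Tuples with the same members in a different order are distinct set elements.
unsorted_draws = {(1, 2), (2, 1)}
print(len(unsorted_draws))  # 2

# Sorting each draw first gives both the same canonical form.
canonical = {tuple(sorted(draw)) for draw in ([1, 2], [2, 1])}
print(canonical)  # {(1, 2)}

# Lists are unhashable, so adding one to a set raises TypeError.
try:
    set().add([1, 2])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
print(list_is_hashable)  # False
```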
I hope this helps. Any questions, please let me know.
Safety Note
To avoid an infinite loop when you change n from 3 to some larger number, you need to know the maximum number of possible samples of the type that you desire.
The relevant mathematical concept for this is a combination.
- Suppose the first argument of random.sample() is range(m), where m is some arbitrary positive integer. This means that the sample will be drawn from a population of m distinct members without replacement.
- Suppose that you wish to have n samples of length k in total.
The number of possible k-combinations from the set of m distinct elements is
m! / (k! * (m - k)!)
You can get this value via
from math import comb
num_comb = comb(m, k)
comb(m, k) gives the number of different ways to choose k elements from m elements without repetition and without order, which is exactly what we want.
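To make the formula concrete, the sketch below computes m! / (k! * (m - k)!) by hand for m = 10 and k = 2, then checks it against math.comb and against a brute-force count using itertools.combinations. It assumes Python 3.8 or later, which is where math.comb was introduced; the variable names are only illustrative.

```python
import itertools
import math

m, k = 10, 2

# The formula written out with factorials (the division is always exact).
by_formula = math.factorial(m) // (math.factorial(k) * math.factorial(m - k))

# The built-in shortcut.
by_comb = math.comb(m, k)

# Brute force: enumerate every k-combination and count them.
by_enumeration = sum(1 for _ in itertools.combinations(range(m), k))

print(by_formula, by_comb, by_enumeration)  # 45 45 45
```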
In the code at the top of this answer, m = 10, k = 2, n = 3. With these m and k, the number of possible k-combinations from the set of m distinct elements is 45. You need to ensure that n is at most 45 if you want to use those specific m and k and avoid an infinite loop.
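If you want the check and the loop in one place, one possible way to package them is a small helper. The name distinct_samples is hypothetical (it is not part of the standard library), and raising ValueError rather than printing and exiting is just my preference. The second function is a different technique, not the loop from above: it lists every combination with itertools.combinations and then picks n of them with random.sample.

```python
import itertools
import math
import random


def distinct_samples(m, k, n):
    """Return n distinct k-combinations drawn from range(m) (hypothetical helper)."""
    limit = math.comb(m, k)
    if n > limit:
        # Without this check the loop below could never finish.
        raise ValueError(
            f"asked for {n} samples, but only {limit} distinct "
            f"{k}-combinations of {m} elements exist"
        )
    found = set()
    while len(found) < n:
        found.add(tuple(sorted(random.sample(range(m), k))))
    return found


def distinct_samples_enumerated(m, k, n):
    """Alternative: list every combination, then pick n without replacement."""
    every_combination = list(itertools.combinations(range(m), k))
    # random.sample raises ValueError itself if n exceeds the population size.
    return set(random.sample(every_combination, n))


print(distinct_samples(10, 2, 3))
print(len(distinct_samples(10, 2, 45)))  # 45, the maximum for m=10, k=2
```

The retry loop in the first function needs more and more attempts as n approaches the maximum, since most draws are repeats by then. The second function avoids that, at the cost of holding all comb(m, k) combinations in memory, so it only makes sense when that number is small.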