Similar to the question "Numpy random choice to produce a 2D-array with all unique values", I am looking for an efficient way of generating:
import numpy as np
n = 1000
k = 10
number_of_combinations = 1000000
p = np.random.rand(n)
p /= np.sum(p)
my_combinations = np.random.choice(n, size=(number_of_combinations, k), replace=False, p=p)
As discussed in the previous question, I want this matrix to have only unique rows. Unfortunately, the solutions provided there do not extend to sampling with specific probabilities p.
My current solution is as follows:
my_combinations = set()
while len(my_combinations) < number_of_combinations:
    new_combination = np.random.choice(n, size=k, replace=False, p=p)
    my_combinations.add(frozenset(new_combination))
print(my_combinations)
However, I think there should be a more efficient NumPy approach that solves this faster.
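One possible direction, given only as a minimal sketch and not a benchmarked solution: the Gumbel top-k trick draws many weighted samples without replacement at once. Adding independent Gumbel noise to log(p) and keeping the k largest entries of each row gives k distinct indices, distributed like successive weighted draws without replacement. The helper name `sample_unique_combinations` and its `batch_size` and `rng` parameters are hypothetical, and the sketch assumes every entry of p is strictly positive.

```python
import numpy as np

def sample_unique_combinations(p, k, count, batch_size=10_000, rng=None):
    """Hypothetical helper: draw `count` distinct k-subsets of range(len(p))."""
    rng = np.random.default_rng() if rng is None else rng
    log_p = np.log(p)  # assumes all p > 0
    seen = set()       # byte keys of the combinations collected so far
    rows = []
    while len(rows) < count:
        # Gumbel top-k: perturb the log-weights, keep the k largest per row.
        keys = log_p + rng.gumbel(size=(batch_size, len(p)))
        idx = np.argpartition(keys, -k, axis=1)[:, -k:]
        idx.sort(axis=1)  # canonical order so equal sets compare equal
        for row in idx:
            key = row.tobytes()
            if key not in seen:
                seen.add(key)
                rows.append(row)
                if len(rows) == count:
                    break
    return np.array(rows)

# Small usage example; the sizes from the question work the same way.
p_demo = np.random.default_rng(0).random(50)
p_demo /= p_demo.sum()
demo = sample_unique_combinations(p_demo, k=3, count=200, batch_size=64)
```

Each batch allocates a `batch_size` by `n` float array, so `batch_size` trades memory for fewer loop iterations. Duplicate rows are still rejected in Python, as in the set-based loop above, but the expensive sampling step is vectorized.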