I am wondering if there's a more efficient way to generate a 2-D array using np.random.choice where each row has unique values.
For example, for an array with shape (3,4), we expect an output of:
# Expected output given a shape (3,4)
array([[0, 1, 3, 2],
       [2, 3, 1, 0],
       [1, 3, 2, 0]])
This means that the values in each row must be unique and drawn from a range set by the number of columns. So for each row in out, the integers should only fall between 0 and 3 (inclusive).
I know that I can achieve it by passing False to the replace argument. But I can only do it for each row and not for the whole matrix. For instance, I can do this:
>>> np.random.choice(4, size=(1,4), replace=False)
array([[0,2,3,1]])
But when I try to do this:
>>> np.random.choice(4, size=(3,4), replace=False)
I get an error like this:
File "<stdin>", line 1, in <module>
File "mtrand.pyx", line 1150, in mtrand.RandomState.choice (numpy\random\mtrand\mtrand.c:18113)
ValueError: Cannot take a larger sample than population when 'replace=False'
I assume it's because it's trying to draw 3 x 4 = 12 samples without replacement (due to the size of the matrix), but I'm only giving it a population of 4.
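To make that assumption concrete, here is a minimal sketch of the arithmetic. The variable names (shape, population, requested, raised) are only for illustration; the point is that the total number of draws is the product of the shape, and that exceeds the population of 4.

```python
import numpy as np

shape = (3, 4)
population = 4

# With replace=False, the draw covers the whole output at once,
# so the number of requested samples is the product of the shape.
requested = int(np.prod(shape))  # 3 * 4 = 12

try:
    np.random.choice(population, size=shape, replace=False)
    raised = False
except ValueError:
    # 12 unique values cannot come from a population of only 4.
    raised = True
```

So the uniqueness constraint is applied across the entire array, not row by row.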
I know that I can solve it by using a for-loop:
>>> a = (np.random.choice(4,size=4,replace=False) for _ in range(3))
>>> np.vstack(list(a))  # vstack expects a sequence; passing a generator is deprecated since NumPy 1.16
array([[3, 1, 2, 0],
       [1, 2, 0, 3],
       [2, 0, 3, 1]])
But is there a workaround that avoids for-loops altogether? (I'm assuming that a for-loop might make it slower when the number of rows is greater than 1000. But as you can see, I am actually creating a generator in a, so I'm not sure whether it has an effect after all.)
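For reference, one loop-free candidate is to fill a matrix with uniform random numbers and argsort each row, so that every row becomes a permutation of 0 to n_cols - 1. This is only a sketch under that assumption, not a benchmarked solution, and random_row_permutations is a hypothetical helper name rather than anything NumPy provides.

```python
import numpy as np

def random_row_permutations(n_rows, n_cols):
    """Hypothetical helper: each row is a random permutation of 0..n_cols-1."""
    # Independent uniform keys; ties are vanishingly unlikely with floats.
    keys = np.random.rand(n_rows, n_cols)
    # Argsorting each row yields the indices 0..n_cols-1 in random order.
    return keys.argsort(axis=1)

out = random_row_permutations(3, 4)
```

This replaces the Python-level loop with one sort per row inside NumPy, at the cost of generating and sorting n_rows * n_cols floats. Whether it is actually faster for more than 1000 rows would need to be timed.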