
Why does numpy.random.choice not work the same way as random.choice? When I do this:

 >>> random.choice([(1,2),(4,3)])
 (1, 2)

It works.

But when I do this:

 >>> np.random.choice([(1,2), (3,4)])
 Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "mtrand.pyx", line 1393, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:15450)
 ValueError: a must be 1-dimensional

How do I achieve the same behavior as random.choice() in numpy.random.choice()?

Daniel Widdis
user2399453

2 Answers


As noted in the docs, np.random.choice expects a 1D array, and your input, when converted to an array, is 2D. So it won't work as-is.
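To see why the input counts as 2D, it can help to check the shape NumPy assigns to it. A minimal sketch (the variable names `pairs` and `arr` are only for illustration):

```python
import numpy as np

pairs = [(1, 2), (3, 4)]

# A list of equal-length tuples is converted to a 2D array,
# which is the case np.random.choice rejects.
arr = np.asarray(pairs)
print(arr.shape)  # (2, 2)
print(arr.ndim)   # 2
```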

To make it work, we can pass in the length of the input and let it pick a single index. Indexing the input with that index gives the equivalent of random.choice, as shown below -

out = a[np.random.choice(len(a))] # a is input

Sample run -

In [74]: a = [(1,2),(4,3),(6,9)]

In [75]: a[np.random.choice(len(a))]
Out[75]: (6, 9)

In [76]: a[np.random.choice(len(a))]
Out[76]: (1, 2)
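If this comes up often, the indexing step could be wrapped in a small helper. This is only a sketch; `pick_one` and `pick_many` are hypothetical names, not part of NumPy:

```python
import numpy as np

def pick_one(seq):
    """Return one element of seq, chosen uniformly at random."""
    # np.random.choice(n) draws a single integer from range(n)
    idx = np.random.choice(len(seq))
    return seq[idx]

def pick_many(seq, k):
    """Return a list of k elements of seq, drawn with replacement."""
    # Draw k indices at once, then look each one up in the original sequence
    return [seq[i] for i in np.random.choice(len(seq), size=k)]

a = [(1, 2), (4, 3), (6, 9)]
one = pick_one(a)
many = pick_many(a, 5)
print(one)
print(many)
```

Because only indices pass through NumPy, the elements keep their original Python type (tuples here).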

Alternatively, we can convert the input to a 1D array of object dtype, which lets us use np.random.choice directly, as shown below -

In [131]: a0 = np.empty(len(a),dtype=object)

In [132]: a0[:] = a

In [133]: a0.shape
Out[133]: (3,)  # 1D array

In [134]: np.random.choice(a0)
Out[134]: (6, 9)

In [135]: np.random.choice(a0)
Out[135]: (4, 3)
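The two-step construction matters. As far as I can tell, passing dtype=object straight to np.array still unpacks the tuples into a 2D array, whereas filling a pre-allocated 1D object array keeps each tuple whole. A small sketch comparing the two (`direct` is just an illustrative name):

```python
import numpy as np

a = [(1, 2), (4, 3), (6, 9)]

# dtype=object alone does not stop NumPy from unpacking the tuples
direct = np.array(a, dtype=object)
print(direct.shape)  # (3, 2)

# Pre-allocate a 1D object array, then fill it: each slot holds one tuple
a0 = np.empty(len(a), dtype=object)
a0[:] = a
print(a0.shape)  # (3,)

picked = np.random.choice(a0)
print(picked)
```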
Divakar
  • 1
    Sure, but it's very clumsy to have to do it this way. I just want to randomly sample from a 1D array of objects. numpy.random.choice() should simply pick one index at random and return the corresponding object to me. Not sure why it doesn't mimic the random.choice() function, which has the intuitive behavior. Enforcing that objects in my list be of a certain type really defeats the purpose. – user2399453 Apr 27 '17 at 18:18
  • 3
    @user3079275 Well that's how `np.random.choice` is designed, either gotta live with it for such a case using the clumsy setup OR create our own custom one. Added another one with object dtype that circumvents it. – Divakar Apr 27 '17 at 18:24
  • 2
    `a[np.random.choice(len(a),size=(2,3))]` isn't clumsy, at least not in my opinion. It only adds one layer of indexing. With this I could just as easily pick random columns of `a`, or random planes in a higher dimension. – hpaulj Apr 27 '17 at 18:47

Relatedly, if you want to randomly sample rows of a 2D matrix like this

x = np.array([[1, 100], [2, 200], [3, 300], [4, 400]])

then you can do something like this:

n_rows = x.shape[0]
x[np.random.choice(n_rows, size=n_rows, replace=True), :]

This should work for a 2D matrix with any number of columns, and you can of course draw as many samples as you want with the size kwarg.
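For instance, here is a sketch that draws two distinct rows with the same indexing idea, followed by an alternative that assumes NumPy 1.17 or later, where the newer Generator interface accepts an axis argument. The seed and the names `subset`, `rng` and `rows` are only for illustration:

```python
import numpy as np

x = np.array([[1, 100], [2, 200], [3, 300], [4, 400]])
n_rows = x.shape[0]

# Two distinct rows, picked via random row indices
subset = x[np.random.choice(n_rows, size=2, replace=False), :]
print(subset.shape)  # (2, 2)

# Alternative (assumes NumPy >= 1.17): Generator.choice can sample along an axis
rng = np.random.default_rng(seed=0)  # seed chosen only for repeatability
rows = rng.choice(x, size=n_rows, replace=True, axis=0)
print(rows.shape)  # (4, 2)
```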

Ben Vincent