To select random values from a 2D array, you can use this:

import numpy as np

pool = np.random.randint(0, 30, size=[4,5])
# Flatten to 1-D first, then draw 3 values without replacement
seln = np.random.choice(pool.reshape(-1), 3, replace=False)

print(pool)
print(seln)

>[[29  7 19 26 22]
 [26 12 14 11 14]
 [ 6  1 13 11  1]
 [ 7  3 27  1 12]]
[11 14 26]

`pool` needs to be reshaped into a 1-D vector because `np.random.choice` cannot handle 2D objects. So, to create a 2D array composed of randomly selected values from the original 2D array, I had to process one row at a time in a loop:

pool = np.random.randint(0, 30, size=[4,5])
seln = np.empty([4,3], int)

# Draw 3 values without replacement from each row of pool
for i in range(pool.shape[0]):
    seln[i] = np.random.choice(pool[i], 3, replace=False)

print('pool = ', pool)
print('seln = ', seln)

>pool =  [[ 1 11 29  4 13]
 [29  1  2  3 24]
 [ 0 25 17  2 14]
 [20 22 18  9 29]]
seln =  [[ 8 12  0]
 [ 4 19 13]
 [ 8 15 24]
 [12 12 19]]

However, I am looking for a parallel method that handles all the rows at the same time instead of one at a time in a loop.

Is this possible? If not with NumPy, how about TensorFlow?

SantoshGupta7
  • Have a look [here](https://stackoverflow.com/questions/14262654/numpy-get-random-set-of-rows-from-2d-array). Perhaps you can tweak some logic here – Sheldore Dec 30 '18 at 22:02
  • If the values were not to be shared among rows, you could have simply used [extract_patches_2d](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.image.extract_patches_2d.html). But you want the values for a given row to be randomly shuffled as well. This gives you the same order as in the original parent array – Sheldore Dec 30 '18 at 22:09
  • So for each row of the new array, you want to randomly select from the respective row of pool, right? – yatu Dec 30 '18 at 22:25
  • @yatu: I guess that's what the OP meant when he/she wrote in the title "(values not shared among rows)" – Sheldore Dec 30 '18 at 22:31

1 Answer

Here's a way that avoids for loops:

pool = np.random.randint(0, 30, size=[4,5])
pool
array([[ 4, 18,  0, 15,  9],
       [ 0,  9, 21, 26,  9],
       [16, 28, 11, 19, 24],
       [20,  6, 13,  2, 27]])

# New array shape
new_shape = (pool.shape[0],3)

# Indices where to randomly choose from
ix = np.random.choice(pool.shape[1], new_shape)
array([[0, 3, 3],
       [1, 1, 4],
       [2, 4, 4],
       [1, 2, 1]])

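Each row of ix is a set of random column indices at which the corresponding row of pool will be sampled. Next, each row of ix is offset by the position where that row starts in the flattened array (the row number times pool.shape[1]), so that the indices address the flattened pool: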

ixs = (ix.T + range(0,np.prod(pool.shape),pool.shape[1])).T
array([[ 0,  3,  3],
       [ 6,  6,  9],
       [12, 14, 14],
       [16, 17, 16]])

And ixs can be used to sample from pool with:

pool.flatten()[ixs].reshape(new_shape)
array([[ 4, 15, 15],
       [ 9,  9,  9],
       [11, 24, 24],
       [ 6, 13,  6]]) 
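
As a cross-check on the offset arithmetic above, here is a minimal sketch that assumes NumPy 1.15 or later, where `np.take_along_axis` is available. It is an alternative to the flattened-index approach, not part of it. The names `row_start`, `via_flat`, and `via_take` are introduced here for illustration only.

```python
import numpy as np

pool = np.random.randint(0, 30, size=(4, 5))
new_shape = (pool.shape[0], 3)

# Random column indices for every row (repeats within a row are possible)
ix = np.random.choice(pool.shape[1], new_shape)

# Flattened-index approach: shift each row's indices by row_number * n_columns
row_start = np.arange(pool.shape[0])[:, None] * pool.shape[1]
via_flat = pool.ravel()[ix + row_start]

# Same selection without manual offsets: pick along axis 1, row by row
via_take = np.take_along_axis(pool, ix, axis=1)

print(np.array_equal(via_flat, via_take))  # True
```

Both routes read the same elements; `take_along_axis` simply leaves the index bookkeeping to NumPy.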
yatu
  • The first, second, third and fourth row entries in your pool do not correspond to the row values of the original array – Sheldore Dec 30 '18 at 23:04
  • Yes, it's just an example array, I also generated it randomly – yatu Dec 30 '18 at 23:05
  • Is it possible to have non-repeated values in each row? I tried altering the code to have `ix = np.random.choice(pool.shape[1], new_shape, replace=False)` but I get `ValueError: Cannot take a larger sample than population when 'replace=False'` – SantoshGupta7 Dec 30 '18 at 23:56
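
On the last comment: the `ValueError` arises because `np.random.choice(pool.shape[1], new_shape, replace=False)` asks for 4 × 3 = 12 distinct indices from a population of only 5. One possible workaround, offered as a sketch and not as the answerer's method, is to build an independent random permutation of the column indices for every row and keep the first `k` of them. The variable `k` and the argsort-of-random-numbers trick are assumptions introduced here.

```python
import numpy as np

pool = np.random.randint(0, 30, size=(4, 5))
k = 3  # values to draw per row; must not exceed pool.shape[1]

# Sorting a row of random numbers yields a random permutation of the
# column indices, so the first k entries of each row are k distinct indices.
ix = np.argsort(np.random.rand(*pool.shape), axis=1)[:, :k]

# Pick the chosen columns from each row (requires NumPy 1.15+)
seln = np.take_along_axis(pool, ix, axis=1)

print(pool)
print(seln)
```

This guarantees distinct positions within each row. If a row of `pool` contains duplicate values, the same value can still appear twice, just as with `np.random.choice(pool[i], 3, replace=False)` in the loop version.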