sample entries from a matrix while satisfying a given requirement

Question

There is a 0-1 matrix, I need to sample M different entries of 1 value from this matrix. Are there any efficient Python implements for this kind of requirement?

A baseline approach is having M iterations, during each iteration, randomly sample 1, if it is of value 1, then keep it and save its position, otherwise, continue this iteration until find entry with value 1; and continue to next iteration. It seems not a good heuristic at all.

What are you working with here? What is a matrix? A list of lists? Some `numpy` data structure? — juanpa.arrivillaga, Jan 27 '17 at 17:58
do really want one at a time? http://stackoverflow.com/q/17385419/6876009 or maybe fast methods to get a list of all http://stackoverflow.com/q/432112/6876009 — f5r5e5d, Jan 27 '17 at 18:26
@junapa.arrivillaga, it is a two-dimensional numpy array, generated from an image, having only white and black color. — user288609, Jan 27 '17 at 18:39

score 0 · Accepted Answer · answered Jan 27 '17 at 20:45

I chose to indirectly index the return from numpy.nonzero

by using pop() on ndx_ndx list to get one (indirect) index into the input array without replacement

eventually ndx_ndx will be emptied when you have gotten all of the ones

import numpy as np


ary = np.random.randint(2, size=(20, 20))

# get the indices of all of the ones

ndx_ary_ones = np.nonzero(ary)

# make a range list for pointing into ndx_ary_ones

ndx_ndx = list(range(len(ndx_ary_ones[0])))

# randomize the order

np.random.shuffle(ndx_ndx)

# pop the last ndx_ndx

a_ran_ndx_ndx = ndx_ndx.pop()

# get the index tuple for the one in ary that we removed from ndx_ndx

a_ran_one_ndx = (ndx_ary_ones[0][a_ran_ndx_ndx],
                 ndx_ary_ones[1][a_ran_ndx_ndx])

# testing...

print('ary', ary, '\n')
print('ndx_ary_ones ', *ndx_ary_ones, sep = '\n')
print('\n','ndx_ndx[0:10] ', ndx_ndx[0:10], '\n')

for _ in range (10):
    a_ran_ndx_ndx = ndx_ndx.pop()
    a_ran_one_ndx = (ndx_ary_ones[0][a_ran_ndx_ndx],
                     ndx_ary_ones[1][a_ran_ndx_ndx])
    print(a_ran_one_ndx, ary[a_ran_one_ndx])

ary [[0 0 0 ..., 1 1 1]
 [0 1 1 ..., 1 1 1]
 [1 0 0 ..., 1 0 1]
 ..., 
 [1 1 0 ..., 1 0 1]
 [1 1 0 ..., 1 1 1]
 [1 0 0 ..., 0 0 1]] 

ndx_ary_ones 
[ 0  0  0 ..., 19 19 19]
[ 3  5  7 ..., 14 15 19]

 ndx_ndx[0:10]  [121, 43, 146, 69, 64, 3, 29, 186, 98, 30] 

(7, 12) 1
(8, 18) 1
(0, 3) 1
(10, 2) 1
(18, 18) 1
(17, 7) 1
(15, 14) 1
(4, 11) 1
(10, 1) 1
(4, 4) 1

score 0 · Answer 2 · answered Jan 28 '17 at 10:42

We can do it in the following manner: first get all the (x,y) tuples (indices) of the matrix A where A[x,y]=1. Let there be k such indices. Now roll a k-sided unbiased dice M times (we can simulate by using function randint(1,k) drawing sample from uniform distribution). If you want samples with replacements (same position of the matrix can be chosen multiple times) then it can be done with M invocations of the function. Otherwise for samples with replacements (without repetitions allowed) you need to keep track of positions already selected and delete those indices from the array before throwing the dive next time.

sample entries from a matrix while satisfying a given requirement

2 Answers2