0

How can I efficiently iterate over all h by n arrays made of 0s and 1s where all the rows and all the columns are distinct? At the moment I do this.

h = 10
n = 15
hxn = np.arange(h*n).reshape(h, -1)
for i in xrange(0, 2**(h*n)):
    M = (i >> hxn) & 1
#DO WORK

But this includes lots of 2d arrays with rows or columns that are the same. I also don't care about the order of the rows or columns.


I don't want to just test each M to see if it has a duplicate row or column and discard those as that will be hugely inefficient. I would like to find a way just to iterate over the much smaller number of matrices with no duplicated row or column.

marshall
  • 2,443
  • 7
  • 25
  • 45
  • Are you trying to first identify whether or not a matrix's rows and cols are distinct and THEN operate on them, or perform an operation on each distinct row and/or col in a given matrix? – a p Feb 14 '14 at 21:57
  • @ap Neither of those. I want to iterate over a subset of all h by n the matrices. That is just the matrices have no duplicated row or column. – marshall Feb 14 '14 at 22:17
  • Are you sure it's hugely inefficient? What shape of matrix are you expecting? – user2357112 Feb 14 '14 at 22:22
  • @user2357112 The shape is h by n. The total number of binary 10 by 15 matrices is 2^(150) in this case. But the number of binary 10 by 15 matrices with no duplicated rows of columns is far smaller. – marshall Feb 14 '14 at 22:24
  • Actually, it's not. A very rough upper bound says that at most 10.4% of 10 by 15 binary matrices have duplicate rows or columns. – user2357112 Feb 14 '14 at 22:26
  • Of course, iterating through all `<= 0.896 * 2**150` 10 by 15 0-1 matrices with no duplicate rows or columns is still a prohibitively expensive task. You may want to tackle your problem from a different angle. – user2357112 Feb 14 '14 at 22:29
  • @user2357112 If we just look at 10 by 15 matrices with no duplicate rows there are 2^15 choose 10 of them which is roughly 3.9 * 10^(38) – marshall Feb 14 '14 at 22:32
  • No, you can't just choose 10, because order matters. (If order doesn't matter, that's a different story.) You're missing a factor of `10!`. – user2357112 Feb 14 '14 at 22:34
  • You still have at least 2.7*10^26 matrices to go through. Is that a scale you can handle? – user2357112 Feb 14 '14 at 22:42

1 Answers1

0

You can set your new matrice M and calculations with Numpy's sum() function like this;

import numpy as np
h = 10
n = 15

M = np.zeros( (h,n), dtype ='float64' ) # create matrice M dimensioned h,n with floats
                                        # if you want integers instead of floats, change
                                        # dtype to 'int32'
hxn = np.arange(h*n).reshape(h, -1)

for i in xrange(0,h):   # loop in range h starting from 0
    for j in range(0,n):  # loop in range n from 0
        M[i,j] = sum([(i*j)*2])   # set matrice row and column, M[i,j] by sum([]) 

        print M
user1749431
  • 559
  • 6
  • 21
  • This seems to imply there are only h*n matrices with no duplicated rows or columns but that isn't right. – marshall Feb 14 '14 at 22:20
  • Right, there are no duplicated rows created with M=np.zeros( (h,n), dtype ='float64' ), it sets the rows and columns strictly from the data in h and n. It totals all the occurences of h and n so if two occurences of the same, for instance h = 5 n =5 occurs twice it will give 2 in the column so you don't have to add *2 unless you want to multipy it two occurences times 2. – user1749431 Feb 14 '14 at 22:27
  • The M's aren't binary. They have to contain only 0 or 1. – marshall Feb 14 '14 at 22:29
  • then M[i,j] += 1 instead of M[i,j] = sum([(i*j)*2]), it will return 1 in the columns where h and n co-occur and 0 in the rest. – user1749431 Feb 14 '14 at 22:33