
I have some simple code and just want to count the number of distinct columns in the product of two matrices. The code is:

import numpy as np
import itertools

n = 5
h = 2
M =  np.random.randint(2, size=(h,n))
F = np.matrix(list(itertools.product([0,1],repeat = 5))).transpose()
product = M*F
setofcols = set()
for column in product.T:
    setofcols.add(column)
print(len(setofcols))

However, this gives the wrong answer: every element of setofcols is treated as different, even when two columns are identical. What is the right way to do this?

I will be running this with bigger values of n and h, so a solution that uses as little memory as possible would be great.

marshall
  • Related: http://stackoverflow.com/questions/12983067/how-to-find-unique-vectors-of-a-2d-array-over-a-particular-axis-in-a-vectorized – Warren Weckesser Dec 11 '13 at 21:14
  • Do you mean with columns each containing unique content? – dawg Dec 11 '13 at 21:16
  • Use `setofcols.add(repr(column))` Answer is `10` – dawg Dec 11 '13 at 21:19
  • @alko I would have to change my code to use arrays for this. I tried F = np.array(list(itertools.product([0,1],repeat = 5))).transpose() but then I get TypeError: unhashable type: 'numpy.ndarray' – marshall Dec 11 '13 at 21:19
  • @marshall why? You can use the solution in the thread I provided for matrices as well. Or, in case you doubt it, use np.array(M*F) – alko Dec 11 '13 at 21:23
  • @dawg without specific seed? I got 6 with `len(set(map(repr, (M*F).T)))` – alko Dec 11 '13 at 21:25
  • @alko Oh I see, thanks. dawg's solution works for me as well. – marshall Dec 11 '13 at 21:26

1 Answer


You can make your code work by adding repr(column) to the set instead of the column itself:

import numpy as np
import itertools

n = 5
h = 2
M = np.random.randint(2, size=(h, n))
# Columns of F are all 0/1 vectors of length n
F = np.matrix(list(itertools.product([0, 1], repeat=n))).transpose()
product = M * F
setofcols = set()
for column in product.T:
    # repr() gives a string, which is hashed and compared by content
    setofcols.add(repr(column))
print(len(setofcols))
print(setofcols)

You can also do this:

setofcols={tuple(e.A1) for e in product.T}

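Here the A1 attribute of a matrix returns its contents as a flattened 1-D array, which tuple can consume as a sequence. The resulting tuples are hashable and compare by content, so identical columns collapse into one set element.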
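Since the question asks about memory for larger n, here is a minimal sketch of the same tuple idea applied in batches, so the full n x 2**n matrix F is never built at once. It is not part of the original answer: the function name `count_distinct_columns` and the `batch` parameter are made up for illustration, and it assumes Python 3.5+ with plain ndarrays and the `@` operator instead of np.matrix.

```python
import itertools
import numpy as np

def count_distinct_columns(M, n, batch=4096):
    """Count distinct columns of M @ F, where the columns of F are
    all 0/1 vectors of length n.

    F is generated `batch` columns at a time, so only one chunk of F
    plus the set of distinct result columns is held in memory.
    """
    M = np.asarray(M)
    vectors = itertools.product([0, 1], repeat=n)  # lazy iterator over columns of F
    seen = set()
    while True:
        chunk = list(itertools.islice(vectors, batch))
        if not chunk:
            break
        F_part = np.array(chunk).T   # shape (n, len(chunk))
        product = M @ F_part         # shape (h, len(chunk))
        # Tuples of plain ints are hashable and compare by content.
        seen.update(map(tuple, product.T.tolist()))
    return len(seen)

if __name__ == "__main__":
    n, h = 5, 2
    M = np.random.randint(2, size=(h, n))
    print(count_distinct_columns(M, n))
```

The set itself can still grow with the number of distinct columns, so this only avoids materializing F and the full product; a smaller `batch` trades speed for a lower peak footprint.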

dawg