Using Python (3.7.7) and numpy (1.17.4), I am working with medium-sized 2D numpy arrays (from 5000x80 up to 200,000x120). For a given array, I want to calculate the Hadamard product between all possible unique pairs of column vectors of that array.
I have:
A A
[a,b,c,d] [a,b,c,d]
[1,2,3,4] [1,2,3,4]
[4,5,6,7] * [4,5,6,7]
[7,8,9,1] [7,8,9,1]
and I want to get:
[a*b, a*c, a*d, b*c, b*d, c*d]
[ 2., 3., 4., 6., 8., 12.]
[20., 24., 28., 30., 35., 42.]
[56., 63., 7., 72., 8., 9.]
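To pin down exactly what I mean, here is a minimal (and deliberately slow) reference sketch that reproduces the expected output above with a plain loop over column pairs. The function name `hadamard_pairs_loop` is just something I made up for illustration; it is not the code I am actually running.

```python
from itertools import combinations

import numpy as np

# Example array from above; its columns are a, b, c, d.
A = np.array([[1, 2, 3, 4],
              [4, 5, 6, 7],
              [7, 8, 9, 1]], dtype=float)


def hadamard_pairs_loop(A):
    """Slow reference: element-wise product of every unique column pair."""
    # All index pairs (i, j) with i < j: (0, 1), (0, 2), ..., (2, 3).
    pairs = list(combinations(range(A.shape[1]), 2))
    # One output column per pair, same number of rows as A.
    products = np.column_stack([A[:, i] * A[:, j] for i, j in pairs])
    return products, pairs


products, pairs = hadamard_pairs_loop(A)
print(products)
print(pairs)
```

This is only meant as a definition of the desired result; with 80 to 120 columns and up to 200,000 rows I would not expect a Python-level loop like this to be fast enough.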
I already have a solution from a colleague using np.kron, which I adapted a bit:
import numpy as np

def hadamard_kron(A: np.ndarray) -> tuple:
    """Return the Hadamard products of all unique pairs of rows of A,
    and the indices signifying which rows constitute a given pair.
    (Pass the transposed array to pair up columns.)
    """
    n = A.shape[0]
    # ind1 repeats each index n times (0,0,...,1,1,...); ind2 cycles through 0..n-1.
    ind1 = np.kron(np.arange(n).reshape((n, 1)), np.ones((n, 1))).squeeze().astype(int)
    ind2 = np.kron(np.ones((n, 1)), np.arange(n).reshape((n, 1))).squeeze().astype(int)
    # Row i*n + j of xmat2 holds A[i] * A[j], for all n*n ordered pairs.
    xmat2 = np.kron(A, np.ones((n, 1))) * np.kron(np.ones((n, 1)), A)
    # Keep only the unique pairs with i < j.
    hadamard_A = xmat2[ind2 > ind1, :]
    ind1_ = ind1[ind1 < ind2]
    ind2_ = ind2[ind1 < ind2]
    return hadamard_A, ind1_, ind2_

# a is the 2D input array; it is transposed so that its columns get paired.
hadamard_A, first_pair_members, second_pair_members = hadamard_kron(a.transpose())
Note that hadamard_A is what I want, but transposed (which is also what I need for further processing). In addition, ind1_ and ind2_ give the indices of the columns that appear as the first and second member, respectively, of each pair for which the Hadamard product is calculated. I need those as well.
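To make the index arrays concrete for the 4-column example, here is a small sketch of the values I expect. It assumes that `np.triu_indices` with `k=1` enumerates the pairs in the same order as ind1_ and ind2_; the names `first_expected`, `second_expected` and `expected_shape` are only for illustration.

```python
import numpy as np

n_cols = 4  # columns a, b, c, d in the example
n_rows = 3  # rows in the example

# Assumption: the strict upper triangle (k=1) lists the pairs (i, j), i < j,
# row by row, i.e. (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3).
first_expected, second_expected = np.triu_indices(n_cols, k=1)

print(first_expected)   # index of the first column in each pair
print(second_expected)  # index of the second column in each pair

# Transposed layout: one row per pair, one column per row of the original array.
expected_shape = (len(first_expected), n_rows)
print(expected_shape)
```

So for the example, the returned matrix should have 6 rows and 3 columns, and that is what the np.kron version above produces.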
However, I feel the np.kron code is too inefficient: it takes too long, and since I call this function several times during my algorithm, I was wondering whether there is a cleverer solution. Am I overlooking some numpy/scipy tools I could cleverly combine for this task?
Thanks all! :)