I'm probably not seeing something obvious here but don't believe np.apply_along_axis
or np.apply_over_axes
is what I'm looking for. Say I have the following two arrays:
arr1 = np.random.randn(10, 5)
arr2 = np.random.randn(10, )
And the following function:
def coefs(x, y):
return np.dot(np.linalg.inv(np.dot(x.T, x)), np.dot(x.T, y))
# the vector of coefficients in a multiple linear regression
Calling this on arr1
and arr2
works smoothly as it should:
coefs(arr1, arr2)
Out[111]: array([-0.19474836, -0.50797551, 0.82903805, 0.06332607, -0.26985597])
However, suppose instead of the 1- or 2d arrays I have two 3d arrays:
arr3 = np.array([arr1[:-1], arr1[1:]])
arr4 = np.array([arr2[:-1], arr2[1:]])
As expected, if I apply the function here I get
coefs(arr3, arr4)
Traceback (most recent call last):
File "<ipython-input-127-4a3e7df02cda>", line 1, in <module>
coefs(arr3, arr4)
File "<ipython-input-124-7532b8516784>", line 2, in coefs
return np.dot(np.linalg.inv(np.dot(x.T, x)), np.dot(x.T, y))
ValueError: shapes (5,9,2) and (2,9,5) not aligned: 2 (dim 2) != 9 (dim 1)
...because NumPy is treating each array as an object as it should. What I want to do instead is apply the coefs()
function to each of the 2 elements along the 0 axis of the arrays, element-wise. Here's a crude way of doing this:
tgt = []
for i, j in zip(arr3, arr4):
tgt.append(coefs(i, j))
np.array(tgt)
Out[136]:
array([[-0.34328006, -0.99116672, 1.42757897, -0.06687851, -0.44669182],
[ 0.44494495, -0.58017705, 0.75825944, 0.18795889, 0.4560851 ]])
My question is, is there a more efficient and pythonic way of doing this than using zip and iterating over, as above? Basically, given two input arrays of shape (2, n, k) and (2, n), I want the array that is returned to be of shape (2, k). Thanks.