Let x
be an unevenly spaced strictly increasing array (e.g., x = np.array([0, .1, .3, .7, .99])
). I have the following spline functions:
from scipy.interpolate import InterpolatedUnivariateSpline
a_ = InterpolatedUnivariateSpline(x, a)
b_ = InterpolatedUnivariateSpline(x, b)
c_ = InterpolatedUnivariateSpline(x, c)
d_ = InterpolatedUnivariateSpline(x, d)
e_ = InterpolatedUnivariateSpline(x, e)
f_ = InterpolatedUnivariateSpline(x, f)
g_ = InterpolatedUnivariateSpline(x, g)
w_ = InterpolatedUnivariateSpline(x, w)
which are built from numpy arrays a,b,c,d,e,f,g,w
of the same length as x
.
Let w
be an array with shape (2,len(x))
. I am now doing the following loop:
y1 = []
y2 = []
for ii in range(len(x)):
xi = x[ii]
A_1 = np.array([[ a_(xi), -a_(xi)],
[ 0, -b_(xi)]])
B_1 = np.array([[ -a_(xi), 0],
[ b_(xi), 0]])
C_1 = np.array([[ 1, 0],
[ c_(xi), 0]])
D_1 = np.array([[ 1, 0],
[ -d_(xi), 1]])
D_2 = np.array([[ -1, 0],
[ 0, e(xi)]])
Amat = A_1 + B_1 @ np.linalg.inv(D_1) @ C_1
Bmat = B_1 @ np.linalg.inv(D_1) @ D_2
Cmat = np.linalg.inv(D_1) @ C_1
Dmat = np.linalg.inv(D_1) @ D_2
K1 = np.array([f_(xi), g_(xi)])
y1 += [-Amat.T@w - Cmat.T@K1]
y2 += [ Dmat.T@K1 + Bmat.T@w ]
This works, however, it is very slow, as x
is large and I need to do this loop many times.
My feeling is that I should be able to vectorize all of this code and it will be very fast. However, when I try to do so, I run into problems with the @
matrix multiplications:
MemoryError: Unable to allocate array with shape (4800, 4800, 4800) and data type float64
Here is my attempt, with the code that produces that error:
nil = np.zeros(len(x))
A_1 = np.array([[ a_(x), -a_(x)],
[ nil, -b_(x)]])
B_1 = np.array([[-a_(x), nil],
[ b_(x), nil]])
C_1 = np.array([[ 1+nil, nil],
[ c_(x), nil]])
D_1 = np.array([[ 1+nil, nil],
[-d_(x), 1+nil]])
D_2 = np.array([[ nil-1, nil],
[ nil, e_(x)]])
Amat = A_1 + B_1 @ np.linalg.pinv(D_1) @ C_1
Bmat = B_1 @ np.linalg.pinv(D_1) @ D_2
Cmat = np.linalg.pinv(D_1) @ C_1
Dmat = np.linalg.pinv(D_1) @ D_2
K1 = np.array([f_(x), g_(x)])
Cmat.T@K1 # produces the error
At a basic level I can move the creation of the A_1, etc matrices outside of the loop, and that speeds it up. However, is it possible to speed it up even more?