See edits below
I have a rather big matrix x
(3xn, n >> 1000) with limited information about the relation of each column.
From this matrix x
, I need to find the biggest angle and the corresponding indices of two columns.
Currently I'm using two for
-loops, which takes way to long, see code below.
I know that I can vectorize the norm of the vector, e.g. via np.linalg.norm(x, axis=0)
.
Also, I found some info on how to vectorize the dot product [here].
However, this will only calculate the dot product between each column, not between say columns 1 and 3.
Is there a way to optimize or even vectorize the problem?
import numpy as np
# Calculate the max angle between two vectors
len_x = 1000
x = np.random.rand(3,len_x) # 3xn matrix
ang_max = 0 # maximum angle
n_max = 0 # n-index of maximum angle
m_max = 0 # m-index of maximum angle
for n in range(len_x - 1):
vn = np.array([x[0][n], \
x[1][n], \
x[2][n]])
for m in range(n + 1, len_x):
vm = np.array([x[0][m], \
x[1][m], \
x[2][m]])
ang = np.arccos(np.dot(vn,vm)/(np.linalg.norm(vn)*np.linalg.norm(vm)))
if ang > ang_max:
ang_max = ang
n_max = n
m_max = m
Edit:
Using np.einsum
I was able to vectorize the inner loop away.
This improved the execution time by a factor of 44, at least in my test case (from 31.98 s down to 0.725 s).
This is a significant improvement, yet I feel that vectorizing the outer loop might give another speed-up by a similar factor.
Thus, I won't mark the question as answered, as log as the outer loop is not vectorized as well.
import numpy as np
# Calculate the max angle between two vectors
len_x = 1000
x = np.random.rand(3,len_x) # 3xn matrix
ang_max = 0 # maximum angle
n_max = 0 # n-index of maximum angle
m_max = 0 # m-index of maximum angle
vm = np.array(x)
vm_norm = np.linalg.norm(vm, axis=0)
for n in range(len_x - 1):
vn = np.array([float(x[n]) - xs, \
float(y[n]) - ys, \
float(z[n]) - zs])
ang = np.arccos(np.divide(np.einsum('ij,ij->j', vn[:,None], vm[:,:,0]),(np.linalg.norm(vn)*vm_norm)[:,0]))
maxang = max(ang)
if maxang > ang_max:
ang_max = maxang
n_max = n
m_max = np.where(ang == maxang)[0][0]