I am rather new to Python and am working through an exercise that is supposed to show that vectorized NumPy code is faster than looping over arrays.
The task is to compute the pairwise L2 (Euclidean) distances between the rows of two large arrays, X with shape (500, 3072) and self.X_train with shape (5000, 3072), using only basic NumPy operations (e.g. np.linalg.norm() is not allowed).
I wrote three functions that compute these distances using two loops, one loop, and no loops:
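To be concrete about what I mean by "L2 distance" here, this is the quantity I understand the task to ask for. The symbols are my own notation, not from the exercise: x_i is row i of X, y_j is row j of self.X_train, and D = 3072 is the number of columns.

```latex
d_{ij} = \lVert x_i - y_j \rVert_2 = \sqrt{\sum_{k=1}^{D} \left( x_{ik} - y_{jk} \right)^2},
\qquad i = 1, \dots, 500, \quad j = 1, \dots, 5000
```

So the result should be a (500, 5000) array with one distance per (test row, training row) pair.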

    def compute_distances_two_loops(self, X):
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            for j in range(num_train):
                dists[i,j] = np.sqrt(np.sum(np.square(X[i,:]-
                                                      self.X_train[j,:])))
        return dists

    def compute_distances_one_loop(self, X):
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            dists[i,:] = np.sqrt(np.sum(np.square(X[i,:]-
                                                  self.X_train),axis=1))
        return dists

    def compute_distances_no_loops(self, X):
        num_test = X.shape[0]
        num_train = self.X_train.shape[0]
        dists = np.zeros((num_test, num_train))
        dists = np.sqrt(np.sum((X[:,np.newaxis,:]-self.X_train)**2,axis=2))
        return dists

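One thing I can check is the shape of the intermediate array that the broadcast subtraction in the no-loop version builds. This is only a small sketch: the tiny arrays are made-up stand-ins for the real data, and the size estimate assumes the data is stored as float64.

```python
import numpy as np

# Tiny stand-in arrays (assumed shapes, for illustration only);
# the real arrays are (500, 3072) and (5000, 3072).
X = np.zeros((4, 6))
X_train = np.zeros((5, 6))

# Same expression as in compute_distances_no_loops:
# (4, 1, 6) broadcast against (5, 6) gives a (4, 5, 6) intermediate.
diff = X[:, np.newaxis, :] - X_train
print(diff.shape)

# Element count and size of the same intermediate at full size,
# assuming float64 (8 bytes per element).
full_elems = 500 * 5000 * 3072
full_bytes = full_elems * np.dtype(np.float64).itemsize
print(full_elems, "elements,", full_bytes / 1e9, "GB")
```

If that estimate is right, the no-loop version materializes a full three-dimensional array of differences before summing, which may be part of why it is not faster than the one-loop version.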
But when I time the three versions, I get the following:
Two loop version took 2.453091 seconds;
One loop version took 1.544929 seconds;
No loop version took 2.020762 seconds.
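For reference, this is roughly how such timings can be taken. It is a minimal sketch rather than the exact harness from the exercise: time_function is a hypothetical helper name, and the small random arrays and the standalone one_loop function are stand-ins for the real data and class method.

```python
import time
import numpy as np

def time_function(f, *args, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of f(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        f(*args)
        best = min(best, time.perf_counter() - start)
    return best

def one_loop(A, B):
    # Standalone version of the one-loop distance computation.
    d = np.zeros((A.shape[0], B.shape[0]))
    for i in range(A.shape[0]):
        d[i, :] = np.sqrt(np.sum(np.square(A[i, :] - B), axis=1))
    return d

# Small stand-in data (assumed sizes, much smaller than the real arrays).
rng = np.random.default_rng(0)
A = rng.random((50, 30))
B = rng.random((80, 30))

elapsed = time_function(one_loop, A, B)
print("One loop version took %f seconds" % elapsed)
```

Taking the best of a few repeats is one way to reduce noise from other processes; a single run can be misleading.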
I assume something is wrong with the way I vectorized the code, since the no-loop version is slower than the one-loop version. Could somebody suggest what I can do better? The exercise includes a hint: "Try to formulate the L2 distance using matrix multiplication and two broadcast sums." But I am new to broadcasting and can't figure out how to use it here.
I looked at other questions on computing L2 distances, but none of them is restricted to basic NumPy operations.
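My best reading of the hint is that it refers to expanding the squared distance as ||x - y||^2 = ||x||^2 - 2 x·y + ||y||^2, so that the cross term becomes one matrix multiplication and the two squared norms are added by broadcasting. A minimal sketch under that assumption follows; l2_distances_matmul is a hypothetical standalone name (not part of the exercise), and the small random arrays are stand-ins for the real data.

```python
import numpy as np

def l2_distances_matmul(X, X_train):
    """Pairwise L2 distances between rows of X and rows of X_train.

    Uses ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2, so no
    (num_test, num_train, D) intermediate array is created.
    """
    # Squared norms of each row, shaped so they broadcast against
    # the (num_test, num_train) cross-term matrix.
    test_sq = np.sum(X ** 2, axis=1)[:, np.newaxis]         # (num_test, 1)
    train_sq = np.sum(X_train ** 2, axis=1)[np.newaxis, :]  # (1, num_train)

    # Cross term: all dot products x_i . y_j in one matrix multiplication.
    cross = X.dot(X_train.T)                                # (num_test, num_train)

    # Two broadcast sums combine the three terms.
    sq_dists = test_sq - 2.0 * cross + train_sq

    # Rounding can leave tiny negative values; clamp them before the sqrt.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.sqrt(sq_dists)

# Check against the broadcast version on small stand-in data.
rng = np.random.default_rng(0)
X = rng.random((5, 7))
X_train = rng.random((8, 7))
reference = np.sqrt(np.sum((X[:, np.newaxis, :] - X_train) ** 2, axis=2))
matches = np.allclose(l2_distances_matmul(X, X_train), reference)
print(matches)
```

Inside the class, the same body would presumably use self.X_train in place of the X_train argument. I have not confirmed that this is the formulation the exercise intends.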
Thank you!