How to efficiently compute an L2 distance between rows of two array using only basic numpy operations?

Question

I am rather new to python and I am doing an exercise which is supposed to show that writing vectorized code using numpy is faster than executing loops over arrays.

The task is to find the L2 distance between two large arrays X with dimensions (500,3072) and self.X_train with dimensions (5000,3072) using only basic numpy operations (e.g. no use of np.linalg.norm() is allowed).

I wrote three functions for finding L2 using: two loops, one loop, and no loops.

def compute_distances_two_loops(self, X):
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        for j in range(num_train):
            dists[i,j] = np.sqrt(np.sum(np.square(X[i,:]- 
            self.X_train[j,:])))
    return dists

def compute_distances_one_loop(self, X):
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        dists[i,:] = np.sqrt(np.sum(np.square(X[i,:]- 
        self.X_train),axis=1))
    return dists

def compute_distances_no_loops(self, X):
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    dists = np.sqrt(np.sum((X[:,np.newaxis,:]-self.X_train)**2,axis=2))
    return dists

But when I time the execution time, I get the following:

Two loop version took 2.453091 seconds;

One loop version took 1.544929 seconds;

No loop version took 2.020762 seconds.

I assume something is wrong with the way I vectorized the code. Could somebody suggest what I can do better? There was a hint in the exercise "Try to formulate the L2 distance using matrix multiplication and two broadcast sums." But I am new to broadcasting and can't figure out how to use it here.

I looked at other questions on finding L2 distance, but none of them needs to execute everything using only basic numpy.

Thank you!

There's [`cdist`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html), [sklearn's version](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise_distances.html), [`eucl_dist`](https://github.com/droyed/eucl_dist) ( disclaimer: I am its author). Relevant posts - https://stackoverflow.com/questions/47877530/, https://stackoverflow.com/questions/1871536/. — Divakar, Aug 28 '19 at 10:00
Thank you but I am not allowed to use any special functions. The main problem I have is that the version without loops is extremely inefficient and I don't understand why. — Evgeniia Edeleva, Aug 28 '19 at 10:13
The vectorized one is probably running into memory issues. For a NumPy only one - https://stackoverflow.com/a/47877630/ and `eucl_dist` one could help. — Divakar, Aug 28 '19 at 10:15
Thank you! It actually is fast now that I use np.sqrt(np.sum(t ** 2,axis=1)[:,None] + np.sum(w ** 2,axis=1) - 2*t.dot(w.T)) to calculate the distance between t and w. But why is this so much fast than using np.sqrt(np.sum((t[:,None,:]-w)**2,axis=2)) ? — Evgeniia Edeleva, Aug 28 '19 at 10:33
The latter one is extending dimensions to `3D`, thus causing heavy memory traffic. When working with vectorized codes, esoecially that deal with array-programming one need to keep those in mind. The former one is still keep things at 2D and also leverage BLAS with that `.dot` to make it faster. — Divakar, Aug 28 '19 at 10:41

How to efficiently compute an L2 distance between rows of two array using only basic numpy operations?

0 Answers0