There are much, much faster alternatives to using nested for
loops for this. I'll show you two different approaches - the first will be a more general method that will introduce you to broadcasting and vectorization, and the second uses a more convenient scipy library function.
- The general way, using broadcasting & vectorization
One of the first things I'd suggest doing is switching to using np.array
rather than np.matrix
. Arrays are preferred for a number of reasons, most importantly because they can have >2 dimensions, and they make element-wise multiplication much less awkward.
import numpy as np
ncoord = np.array(ncoord)
With an array, we can eliminate the nested for
loops by inserting a new singleton dimension and broadcasting the subtraction over it:
# indexing with None (or np.newaxis) inserts a new dimension of size 1
print(ncoord[:, :, None].shape)
# (20, 2, 1)
# by making the 'inner' dimensions equal to 1, i.e. (20, 2, 1) - (1, 2, 20),
# the subtraction is 'broadcast' over every pair of rows in ncoord
xydiff = ncoord[:, :, None] - ncoord[:, :, None].T
print(xydiff.shape)
# (20, 2, 20)
This is equivalent to looping over every pair of rows using nested for loops, but much, much faster!
xydiff2 = np.zeros((20, 2, 20), dtype=xydiff.dtype)
for ii in range(20):
for jj in range(20):
for kk in range(2):
xydiff[ii, kk, jj] = ncoords[ii, kk] - ncoords[jj, kk]
# check that these give the same result
print(np.all(xydiff == xydiff2))
# True
The rest we can also do using vectorized operations:
# we square the differences and sum over the 'middle' axis, equivalent to
# computing (x_i - x_j) ** 2 + (y_i - y_j) ** 2
ssdiff = (xydiff * xydiff).sum(1)
# finally we take the square root
D = np.sqrt(ssdiff)
The whole thing could be done in one line like this:
D = np.sqrt(((ncoord[:, :, None] - ncoord[:, :, None].T) ** 2).sum(1))
- The lazy way, using
pdist
It turns out that there's already a fast and convenient function for computing all pairwise distances: scipy.spatial.distance.pdist
.
from scipy.spatial.distance import pdist, squareform
d = pdist(ncoord)
# pdist just returns the upper triangle of the pairwise distance matrix. to get
# the whole (20, 20) array we can use squareform:
print(d.shape)
# (190,)
D2 = squareform(d)
print(D2.shape)
# (20, 20)
# check that the two methods are equivalent
print np.all(D == D2)
# True