
Given two 3-D arrays of dimensions (2,2,2):

A = [[[ 0,  0],
    [92, 92]],

   [[ 0, 92],
    [ 0, 92]]]

B = [[[ 0,  0],
    [92,  0]],

   [[ 0, 92],
    [92, 92]]]

How do you find the Euclidean distance for each vector in A and B efficiently?

I have tried for-loops but these are slow, and I'm working with 3-D arrays in the order of (>>2, >>2, 2).

Ultimately I want a matrix of the form:

C = [[d1, d2],
     [d3, d4]]

Edit:

I've tried the following loop, but its biggest issue is that it loses the dimensions I want to keep. The distances themselves are correct, though:

[numpy.sqrt((A[row, col][0] - B[row, col][0])**2 + (B[row, col][1] - A[row, col][1])**2) for row in range(2) for col in range(2)]
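(For the record, the flat list this loop produces can be reshaped back to the desired (2, 2) matrix; a minimal sketch, assuming the row-major order of the nested loops:)

```python
import numpy as np

A = np.array([[[0, 0], [92, 92]],
              [[0, 92], [0, 92]]])
B = np.array([[[0, 0], [92, 0]],
              [[0, 92], [92, 92]]])

# Flat list of distances from the nested loop...
flat = [np.sqrt((A[r, c][0] - B[r, c][0])**2 + (A[r, c][1] - B[r, c][1])**2)
        for r in range(2) for c in range(2)]

# ...reshaped back to the (2, 2) matrix C
C = np.array(flat).reshape(2, 2)
```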
user1658296

3 Answers


Thinking in a NumPy vectorized way, this amounts to taking the element-wise difference, squaring, summing along the last axis, and finally taking the square root. So the straightforward implementation would be -

np.sqrt(((A - B)**2).sum(-1))
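On the sample (2,2,2) arrays from the question, this one-liner gives the desired (2,2) result directly (a quick check, not in the original answer):

```python
import numpy as np

A = np.array([[[0, 0], [92, 92]],
              [[0, 92], [0, 92]]])
B = np.array([[[0, 0], [92, 0]],
              [[0, 92], [92, 92]]])

# Difference, square, sum along the last axis, then square root
C = np.sqrt(((A - B)**2).sum(-1))
print(C)  # [[ 0. 92.]
          #  [ 0. 92.]]
```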

We could perform the squaring and summing along the last axis in one go with np.einsum and thus make it more efficient, like so -

subs = A - B
out = np.sqrt(np.einsum('ijk,ijk->ij', subs, subs))
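As a sanity check (not in the original answer), the einsum version matches the plain sum-based one on random data:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 5, 2))
B = rng.random((4, 5, 2))

subs = A - B
# Squaring and summing along the last axis fused into one einsum call
out_einsum = np.sqrt(np.einsum('ijk,ijk->ij', subs, subs))
out_sum = np.sqrt(((A - B)**2).sum(-1))
```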

Another alternative with the numexpr module -

import numexpr as ne
np.sqrt(ne.evaluate('sum((A-B)**2,2)'))

Since we are working with a length of 2 along the last axis, we could simply slice those components and feed them to the evaluate method. Note that slicing isn't possible inside the evaluate string, so the modified implementation would be -

a0 = A[...,0]
a1 = A[...,1]
b0 = B[...,0]
b1 = B[...,1]
out = ne.evaluate('sqrt((a0-b0)**2 + (a1-b1)**2)')
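The same slicing idea can be checked in plain NumPy (a sketch for verifying the expression against the direct formula; numexpr itself is only needed for the speed-up):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((3, 3, 2))
B = rng.random((3, 3, 2))

# Same components the numexpr string would see
a0, a1 = A[..., 0], A[..., 1]
b0, b1 = B[..., 0], B[..., 1]

# Same arithmetic the evaluate string performs, written in NumPy
out = np.sqrt((a0 - b0)**2 + (a1 - b1)**2)
```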

Runtime test

Function definitions -

def sqrt_sum_sq_based(A,B):
    return np.sqrt(((A - B)**2).sum(-1))

def einsum_based(A,B):
    subs = A - B
    return np.sqrt(np.einsum('ijk,ijk->ij', subs, subs))

def numexpr_based(A,B):
    return np.sqrt(ne.evaluate('sum((A-B)**2,2)'))

def numexpr_based_with_slicing(A,B):
    a0 = A[...,0]
    a1 = A[...,1]
    b0 = B[...,0]
    b1 = B[...,1]
    return ne.evaluate('sqrt((a0-b0)**2 + (a1-b1)**2)')

Timings -

In [288]: # Setup input arrays
     ...: dim = 2
     ...: N = 1000
     ...: A = np.random.rand(N,N,dim)
     ...: B = np.random.rand(N,N,dim)
     ...: 

In [289]: %timeit sqrt_sum_sq_based(A,B)
10 loops, best of 3: 40.9 ms per loop

In [290]: %timeit einsum_based(A,B)
10 loops, best of 3: 22.9 ms per loop

In [291]: %timeit numexpr_based(A,B)
10 loops, best of 3: 18.7 ms per loop

In [292]: %timeit numexpr_based_with_slicing(A,B)
100 loops, best of 3: 8.23 ms per loop

In [293]: %timeit np.linalg.norm(A-B, axis=-1) #@dnalow's soln
10 loops, best of 3: 45 ms per loop
Divakar
  • Would you have links or resources that compare `np.einsum` vs regular numpy efficiency? – jadsq Oct 29 '16 at 13:45
  • @jadsq I guess [this one](http://stackoverflow.com/questions/18365073/why-is-numpys-einsum-faster-than-numpys-built-in-functions) if we are talking about a general case. In this specific case, we are doing two things in one go with `einsum`, and that makes it really beneficial here. – Divakar Oct 29 '16 at 14:04
  • @Divakar, thank you! I've learned something new from you again today. – MaxU - stand with Ukraine Oct 29 '16 at 14:31
  • numexpr is much faster with slicing. My arrays are quite large though: (8464, 8464, 2), and I find the computation still too slow. – user1658296 Nov 14 '16 at 11:31
  • @user1658296 How much time is it taking? – Divakar Nov 14 '16 at 11:34
  • I'm getting approximately 2.5 seconds, which is far better than linalg, which on average was taking ~13 seconds. – user1658296 Nov 14 '16 at 11:40
  • I have a vector of shape `(1,10,512)` and need to calculate its Euclidean distance from a matrix of shape `(20000,1,512)`, with required output shape `(20000,10)`. How do I use numexpr-based slicing to my advantage? – mibrahimy Mar 15 '19 at 11:24

For completeness:

np.linalg.norm(A-B, axis=-1)
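Applied to the question's sample arrays, this gives the same (2,2) distance matrix as the other answers (a quick check, not in the original answer):

```python
import numpy as np

A = np.array([[[0, 0], [92, 92]],
              [[0, 92], [0, 92]]])
B = np.array([[[0, 0], [92, 0]],
              [[0, 92], [92, 92]]])

# Vector 2-norm along the last axis gives the per-vector distances
C = np.linalg.norm(A - B, axis=-1)
print(C)  # [[ 0. 92.]
          #  [ 0. 92.]]
```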
dnalow

I recommend being extremely careful when rolling your own squares and square roots instead of using the standard built-in math.hypot and np.hypot. These are fast, optimized, and safe against overflow and underflow.

In that sense, np.linalg.norm(A-B, axis=-1) looks safest here.

For very large arrays, NumPy's broadcasting becomes memory-bound and slows down. In such cases you would want to use for-loops without compromising on speed, and numba is a good fit for that.

import math
import numpy as np

i, j, k = 1e+200, 1e+200, 1e+200

# math.hypot (three or more arguments need Python 3.8+) handles huge inputs safely
math.hypot(i, j, k)
# 1.7320508075688773e+200

# The naive square-and-sum overflows before the square root can help
np.sqrt(np.sum(np.array([i, j, k]) ** 2))
# RuntimeWarning: overflow encountered in square

Refer: speed/overflow/underflow
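For the two-component case in the question, np.hypot can be applied directly to the sliced components (a sketch, not from the original answer; it avoids the intermediate overflow shown above):

```python
import numpy as np

A = np.array([[[0, 0], [92, 92]],
              [[0, 92], [0, 92]]], dtype=float)
B = np.array([[[0, 0], [92, 0]],
              [[0, 92], [92, 92]]], dtype=float)

# Elementwise hypot over the two components of each vector
C = np.hypot(A[..., 0] - B[..., 0], A[..., 1] - B[..., 1])
print(C)  # [[ 0. 92.]
          #  [ 0. 92.]]

# hypot computes sqrt(x**2 + y**2) without overflowing the intermediate squares
big = np.hypot(1e200, 1e200)
```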
eroot163pi