Say I have a (3, 3, 3) array like this:

array([[[1, 1, 1],
        [1, 1, 1],
        [0, 0, 0]],

       [[2, 2, 2],
        [2, 2, 2],
        [2, 2, 2]],

       [[3, 3, 3],
        [3, 3, 3],
        [1, 1, 1]]])

How do I get the 9 values corresponding to the Euclidean distance between each vector of 3 values and the corresponding vector in the zeroth block (arr[0])?

That is, the equivalent of computing numpy.linalg.norm([1,1,1] - [1,1,1]) twice and then norm([0,0,0] - [0,0,0]); then norm([2,2,2] - [1,1,1]) twice and norm([2,2,2] - [0,0,0]); then norm([3,3,3] - [1,1,1]) twice and finally norm([1,1,1] - [0,0,0]).

Is there a good way to vectorize this? I want to store the distances in a (3, 3, 1) array.

The result would be:

array([[[0. ],
        [0. ],
        [0. ]],

       [[1.73],
        [1.73],
        [3.46]],

       [[3.46],
        [3.46],
        [1.73]]])
chimpsarehungry
  • Yes, unfortunately, `norm` doesn't allow an `axis` arg. I don't know why. You might find the answer you're looking for in this [similar question](http://stackoverflow.com/questions/7741878/how-to-apply-numpy-linalg-norm-to-each-row-of-a-matrix) – shx2 May 10 '13 at 04:17

3 Answers

The keepdims argument was added in NumPy 1.7; you can use it to keep the summed axis as a length-1 dimension:

np.sum((x - [1, 1, 1])**2, axis=-1, keepdims=True)**0.5

The result is:

[[[ 0.        ]
  [ 0.        ]
  [ 0.        ]]

 [[ 1.73205081]
  [ 1.73205081]
  [ 1.73205081]]

 [[ 3.46410162]
  [ 3.46410162]
  [ 0.        ]]]

Edit: to measure against the corresponding vectors in x[0] instead of a fixed [1, 1, 1], subtract x[0]:

np.sum((x - x[0])**2, axis=-1, keepdims=True)**0.5

The result is:

array([[[ 0.        ],
        [ 0.        ],
        [ 0.        ]],

       [[ 1.73205081],
        [ 1.73205081],
        [ 3.46410162]],

       [[ 3.46410162],
        [ 3.46410162],
        [ 1.73205081]]])
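
For readers on a newer NumPy than the one discussed here, a minimal self-contained sketch follows. It rebuilds the question's array as `x` and compares the manual sum-of-squares form with `np.linalg.norm`. This assumes a release in which `norm` accepts both `axis` and `keepdims` (to my knowledge, `axis` arrived in 1.8 and `keepdims` in 1.10); the names `diff`, `d_sum` and `d_norm` are only illustrative.

```python
import numpy as np

x = np.array([[[1, 1, 1], [1, 1, 1], [0, 0, 0]],
              [[2, 2, 2], [2, 2, 2], [2, 2, 2]],
              [[3, 3, 3], [3, 3, 3], [1, 1, 1]]])

# x[0] has shape (3, 3); broadcasting subtracts it from every block x[k].
diff = x - x[0]

# Manual form: square, sum over the last axis, take the square root.
d_sum = np.sum(diff**2, axis=-1, keepdims=True)**0.5

# Same computation via norm(); assumes norm() supports axis= and keepdims=.
d_norm = np.linalg.norm(diff, axis=-1, keepdims=True)

print(d_norm.shape)
print(np.allclose(d_sum, d_norm))
```

Both forms should give the same (3, 3, 1) array, so the choice between them is mostly a matter of readability.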
HYRY
  • Thanks for your help. You are probably not far off from the answer; I edited the question to show how I need the zeroth values as in arr[0], not just arr[0][0]. – chimpsarehungry May 10 '13 at 13:33
  • @chimpsarehungry, I edited the answer: just replace `[1,1,1]` with `x[0]` and you will get the result. – HYRY May 10 '13 at 22:00

You might want to consider scipy.spatial.distance.cdist(), which efficiently computes the distances between all pairs of points in two collections of inputs (using the standard Euclidean metric by default, with others available). Here's some example code:

import numpy as np
import scipy.spatial.distance as dist

i = np.array([[[1, 1, 1],
               [1, 1, 1],
               [0, 0, 0]],
              [[2, 2, 2],
               [2, 2, 2],
               [2, 2, 2]],
              [[3, 3, 3],
               [3, 3, 3],
               [1, 1, 1]]])
n, m, o = i.shape

# compute the Euclidean distance from each vector i[a, b] to i[0, b]:
# 1. reshape the input array to 2-D, as required by cdist
# 2. cdist returns all pairwise distances, shape (n*m, m)
# 3. keep only the diagonal, where both vectors share the same row index b
# 4. reshape the result to the required (n, m, 1) output
pairwise = dist.cdist(i.reshape(n * m, o), i[0]).reshape(n, m, m)
d = pairwise.diagonal(axis1=1, axis2=2).reshape(n, m, 1)

d holds:

array([[[ 0.        ],
        [ 0.        ],
        [ 0.        ]],

       [[ 1.73205081],
        [ 1.73205081],
        [ 3.46410162]],

       [[ 3.46410162],
        [ 3.46410162],
        [ 1.73205081]]])

The big caveat of this approach is that it calculates n*m*m distances when only n*m are needed, and that it involves a lot of reshaping.
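
If the extra pairwise distances are a concern, one way to compute only the n*m distances that are needed is to skip cdist and reduce the differences directly. The following is a sketch using `np.einsum` rather than cdist, so it is a swapped-in technique and not part of the approach above; `diff` and `sq` are illustrative names.

```python
import numpy as np

i = np.array([[[1, 1, 1], [1, 1, 1], [0, 0, 0]],
              [[2, 2, 2], [2, 2, 2], [2, 2, 2]],
              [[3, 3, 3], [3, 3, 3], [1, 1, 1]]])

# Difference between each vector i[a, b] and its counterpart i[0, b].
diff = i - i[0]

# Sum of squares over the last axis only: one value per (a, b) pair.
sq = np.einsum('nmo,nmo->nm', diff, diff)

# Square root, then add a trailing length-1 axis to get shape (n, m, 1).
d = np.sqrt(sq)[..., np.newaxis]

print(d.shape)
```

This avoids the intermediate (n*m, m) distance matrix and the diagonal extraction.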

fgb

I'm doing something similar: computing the sum of squared distances (SSD) for each pair of frames in a video volume. I think it could be helpful for you.

video_volume is a single 4-D numpy array. It should have dimensions (time, rows, cols, 3) and dtype np.uint8.

Output is a square 2d numpy array of dtype float. output[i,j] should contain the SSD between frames i and j.

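import numpy as np

# video_volume: 4-D uint8 array of shape (time, rows, cols, 3), assumed to exist already
# convert to float first so the uint8 subtraction cannot wrap around
video_volume = video_volume.astype(float)
size_t = video_volume.shape[0]
output = np.zeros((size_t, size_t), dtype=float)  # np.float was removed from NumPy
for i in range(size_t):
    for j in range(size_t):
        output[i, j] = np.square(video_volume[i] - video_volume[j]).sum()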
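
The double loop can also be replaced by a matrix product. Below is a rough sketch, assuming each frame fits comfortably in memory when flattened to a row; it uses the identity |a - b|^2 = |a|^2 + |b|^2 - 2 a.b. The random `video_volume` is made-up test data, and `frames` and `sq_norms` are illustrative names.

```python
import numpy as np

# Made-up stand-in for a real video: 4 frames of 5x6 RGB pixels.
rng = np.random.default_rng(0)
video_volume = rng.integers(0, 256, size=(4, 5, 6, 3), dtype=np.uint8)

# Flatten every frame to one row and convert to float to avoid uint8 overflow.
frames = video_volume.reshape(video_volume.shape[0], -1).astype(float)

# Squared norm of each frame, shape (time,).
sq_norms = (frames**2).sum(axis=1)

# SSD for every pair of frames via |a|^2 + |b|^2 - 2 a.b.
output = sq_norms[:, None] + sq_norms[None, :] - 2.0 * frames @ frames.T

# Guard against tiny negative values from floating-point rounding.
output = np.maximum(output, 0.0)

print(output.shape)
```

The trade-off is memory: the flattened float copy of the whole video must fit in RAM, whereas the loop only ever touches two frames at a time.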