Turn Numpy Array of Points into Numpy Array of Distances

Question

If we are given "starting_point" and "list_of_points", how do we create a new numpy array "distances" that contains the distance between the "starting_point" and each point in "list_of_points"?

I tried to do this by looping through "list_of_points" with the following code, but it did not work:

distances = sqrt( (list_of_points[num][0] - starting_point[0])^2 + list_of_points[num][1] - starting_point[1])^2 ) for num in range (0,4)

starting_point = np.array([1.0, 2.0])

list_of_points = np.array([-5.0, -3.0], [-4.0, 2.0], [7.0, 8.0], [6.0, -9.0])  

distances = np.array([ d1 ], [ d2 ], [ d3 ], [ d4 ])

The first problem with your code is that you have unbalanced parentheses, so the whole thing is a SyntaxError. The second problem is that `^2` doesn't mean squared, it means XOR 2; you want `**2`. Your third problem is that `list_of_points` isn't a valid array constructor. — abarnert, May 01 '18 at 00:59
Anyway, you rarely want to loop over things in numpy; that defeats the entire purpose of using it. You probably want something like `sqrt((list_of_points[:,0] - starting_point[0])** 2 + (list_of_points[:,1] - starting_point[1])**2)`. Or, better, look up `np.hypot` or `np.linalg.norm`, `scipy.spatial.distance`, etc. — abarnert, May 01 '18 at 01:04
If you do not use a loop, how do you iterate through all the "list_of_points" to create "distances"? — user9718320, May 01 '18 at 01:32
The whole point of numpy is that all of its operations are elementwise. You just do `list_of_points[:,0] - starting_point[0]` and it returns an array of all of the differences. If you don't get that, you need to read a basic numpy tutorial before you go any further. — abarnert, May 01 '18 at 01:35
You can also use einsum it is extremely flexible. there are numerous examples... https://stackoverflow.com/questions/46571624/sorting-points-from-distance-to-a-given-point-x-y-here-in-my-case-x-0-y-o/46574290#46574290 for example — NaN, May 01 '18 at 02:45

score 0 · Answer 1 · answered May 01 '18 at 02:12

You are on the right track with using Numpy for this. I personally found Numpy very unintuitive when I first used it, but it gets (a little) easier with practice.

The basic idea is that you want to avoid loops and use vectorized operations. This allows much faster operations on large data structures.

One part of vectorization is broadcasting, where Numpy can apply operations over differently shaped objects. So in this case you can subtract without looping:

import numpy as np
starting_point = np.array([1.0, 2.0])
list_of_points = np.array([[4, 6], [5, 5], [3, 2]]) 

# subtract starting_point from each point in list_of_points
dif = list_of_points - starting_point
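To see what broadcasting did in that subtraction, it can help to look at the shapes involved. This is only a small sketch using the same arrays as above: the (2,) starting point is stretched across each row of the (3, 2) array of points.

```python
import numpy as np

starting_point = np.array([1.0, 2.0])                # shape (2,)
list_of_points = np.array([[4, 6], [5, 5], [3, 2]])  # shape (3, 2)

# starting_point is broadcast across every row of list_of_points
dif = list_of_points - starting_point

print(starting_point.shape, list_of_points.shape, dif.shape)
print(dif)  # one row of (dx, dy) per point
```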

If you dig around in the docs you'll find all sorts of vectorized operations, including np.linalg.norm(), which calculates different kinds of norms, including distances. The trick to using this is to figure out which axis to use: with one point per row, axis=1 takes the norm across each row and gives one distance per point. I've changed the values to make it easy to confirm the correct answers:

import numpy as np
starting_point = np.array([1.0, 2.0])
list_of_points = np.array([[4, 6], [5, 5], [3, 2]])
np.linalg.norm(starting_point - list_of_points, axis=1)

# array([ 5.,  5.,  2.])

You can also do it the hard way by squaring, summing and taking the square root if you want to:

np.sqrt(
    np.sum(
        np.square(list_of_points - starting_point),
        axis=1,
    )
)
# array([ 5.,  5.,  2.])
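
The comments also mention np.hypot and einsum. As a rough sketch of how those might look on the points from the question (assuming list_of_points was meant to be a 4x2 array, so the constructor gets an extra pair of brackets), both should agree with np.linalg.norm:

```python
import numpy as np

starting_point = np.array([1.0, 2.0])
# Assumption: the question intended one (x, y) point per row
list_of_points = np.array([[-5.0, -3.0], [-4.0, 2.0], [7.0, 8.0], [6.0, -9.0]])

dif = list_of_points - starting_point

# np.hypot computes sqrt(dx**2 + dy**2) elementwise
distances = np.hypot(dif[:, 0], dif[:, 1])

# einsum version: sum of squares along each row, then square root
distances_einsum = np.sqrt(np.einsum('ij,ij->i', dif, dif))

print(distances)
```

np.hypot only takes two components, so it fits 2D points like these; np.linalg.norm with axis=1 or the einsum form carries over to points with more coordinates.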
