-2

I am trying to write Python code that, given the names and GPS positions (latitude, longitude, and elevation) of 750 people, finds the names of the 10 closest neighbors of a randomly selected individual.

import random

# random = rand.sample(range(0, 750), 10)
# placeholder data: 750 random (latitude, longitude, elevation) triples, no units yet
coords = [(random.random()*2.0, random.random()*2.0, random.random()*2.0) for _ in range(750)]
41 72 6c
Tonikami04
  • It would be useful to specify units and the preferred method of distance calculations. –  Jul 19 '19 at 19:51
  • Are the coordinates close together (within some small bounding box of lon-lat, e.g. New York City) or spread over the entire world? We can't tell from your code because there aren't any units. Then you need to decide if your distance metric is circular, Cartesian (or Manhattan). But please show us some sample actual coords, say five. – smci Jul 20 '19 at 04:23

3 Answers

1

To do this, either work in spherical coordinates or convert to Cartesian. Working in Cartesian assumes that you measure distance along a straight line between the two points, not along a great elliptic arc over the Earth's surface.

import numpy as np
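# DistanceMetric lives in sklearn.metrics in recent scikit-learn; older releases exposed it in sklearn.neighbors
from sklearn.metrics import DistanceMetric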

R = 6371 # approximate radius of earth in km

# coordinates in (lat,lon,elv) in units of (rad,rad,km)
coords = np.random.random((750, 3)) * 2
cart_coords = np.array([((R+coord[2]) * np.cos(coord[0]) * np.cos(coord[1]),
                         (R+coord[2]) * np.cos(coord[0]) * np.sin(coord[1]),
                         (R+coord[2]) * np.sin(coord[0])) for coord in coords])

# calculate distances between points
dist = DistanceMetric.get_metric('euclidean')
dist_vals = dist.pairwise(cart_coords)

# pick a random person
random_person = np.random.choice(np.arange(750))
top_ten = np.where(dist_vals[random_person] < sorted(dist_vals[random_person])[11])[0]
# remove self from list
top_ten = top_ten[top_ten!=random_person]

print(top_ten)

If you wish to ignore the elevation and use the haversine formula, see the post "Vectorizing Haversine distance calculation in Python".
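
As a rough illustration of the haversine option, here is a minimal pure-Python sketch. It assumes latitude and longitude are in degrees, reuses the 6371 km spherical radius from the code above, and ignores elevation entirely. The function names and the random sample data are hypothetical, not taken from the linked post.

```python
import heapq
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # clamp to 1.0 so rounding error cannot push sqrt(a) outside asin's domain
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def nearest_by_haversine(points, target, k=10):
    """Indices of the k points closest to points[target], excluding target."""
    others = (i for i in range(len(points)) if i != target)
    return heapq.nsmallest(
        k, others, key=lambda i: haversine_km(points[target], points[i]))


# Hypothetical sample data: 750 random (lat, lon) pairs in degrees.
rng = random.Random(0)
points = [(rng.uniform(-90, 90), rng.uniform(-180, 180)) for _ in range(750)]
target = rng.randrange(len(points))
neighbors = nearest_by_haversine(points, target)
print(neighbors)
```

A plain loop over 750 points is fast enough here; the vectorized approach in the linked post matters more for much larger data sets.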

The Earth is an ellipsoid with a difference of about 21 km between the polar and equatorial radii. If you really want to go deeper, you can look into the science of geodesy. astropy is a good package for this type of problem: https://docs.astropy.org/en/stable/api/astropy.coordinates.spherical_to_cartesian.html
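
To make the ellipsoid point concrete, the sketch below converts geodetic coordinates to Earth-centered Cartesian coordinates using the standard closed-form conversion. It assumes the WGS84 ellipsoid (the datum GPS receivers normally report in), which the question does not actually specify, and it does not use astropy. The function name is hypothetical.

```python
import math

# WGS84 constants (an assumption: the question does not name a datum).
WGS84_A = 6378137.0                 # semi-major (equatorial) axis, metres
WGS84_F = 1 / 298.257223563         # flattening
WGS84_E2 = WGS84_F * (2 - WGS84_F)  # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert geodetic (lat, lon in degrees, height in metres) to ECEF x, y, z."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # prime-vertical radius of curvature at this latitude
    n = WGS84_A / math.sqrt(1 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - WGS84_E2) + height_m) * math.sin(lat)
    return x, y, z


equator = geodetic_to_ecef(0.0, 0.0, 0.0)
pole = geodetic_to_ecef(90.0, 0.0, 0.0)
# equatorial radius minus polar radius, in metres
print(equator[0] - pole[2])
```

The resulting x, y, z values can be fed to any Euclidean nearest-neighbor search in place of the spherical conversion.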

0

Couldn't you just use the distance formula for two points given x, y, z, d = sqrt((x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2), to get the distance between the randomly selected person and every other person? Calculate the distance of every person from the random person, then keep only the ten lowest values.
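
A minimal sketch of that brute-force idea, under some assumptions: latitude and longitude are in degrees, elevation is in km, and the Earth is a sphere of radius 6371 km. Because the raw data is not Cartesian (as the comment below points out), each position is first converted to x, y, z. The names, helper functions, and sample data are hypothetical.

```python
import heapq
import math
import random

EARTH_RADIUS_KM = 6371.0  # spherical-Earth approximation


def to_cartesian(lat_deg, lon_deg, elev_km):
    """Convert (lat, lon, elevation) to Earth-centered x, y, z in km."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    r = EARTH_RADIUS_KM + elev_km
    return (r * math.cos(lat) * math.cos(lon),
            r * math.cos(lat) * math.sin(lon),
            r * math.sin(lat))


def closest_names(people, target_name, k=10):
    """Names of the k people closest (straight-line distance) to target_name."""
    xyz = {name: to_cartesian(*pos) for name, pos in people.items()}
    origin = xyz[target_name]
    others = (name for name in xyz if name != target_name)
    # math.dist is the Euclidean distance formula (Python 3.8+)
    return heapq.nsmallest(
        k, others, key=lambda name: math.dist(origin, xyz[name]))


# Hypothetical data: names mapped to (lat deg, lon deg, elevation km).
rng = random.Random(0)
people = {f"person_{i}": (rng.uniform(-90, 90),
                          rng.uniform(-180, 180),
                          rng.uniform(0, 2))
          for i in range(750)}
chosen = rng.choice(list(people))
closest = closest_names(people, chosen)
print(chosen, closest)
```

`heapq.nsmallest` keeps only the ten lowest distances without sorting all 750, which matches the "only store the ten lowest values" step.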

v0rtex
  • The data isn't in Cartesian coordinates. Since lat/lon coordinates wrap around on themselves, that won't work without preparing the data first. –  Jul 22 '19 at 01:45
0

You could use the excellent BallTree from sklearn:

import numpy as np
from sklearn.neighbors import BallTree

coords = np.random.random((750, 3)) * 2
tree = BallTree(coords)
random_person = np.random.choice(np.arange(750))
# ask for 11 neighbors: the nearest hit is the person themselves, so drop it
closest_people = tree.query(coords[None, random_person], k=11)[1][0, 1:]
Dan Perera
  • BallTree does have a 'haversine' distance metric, but it's a 2 dimensional lat/lon measurement and won't work with altitude as far as I'm aware. – Dan Perera Jul 19 '19 at 14:19
  • There is a possibility of problems until some care is taken with respect to the coordinate system and units being used. For instance, the 180th meridian goes through Fiji, and that will pose a problem for this code. –  Jul 19 '19 at 14:20
  • It isn't just the haversine issue. In the problem (assuming degrees for lat/lon) it is comparing up to a few hundred km in two of the dimensions with something unitless in the radial direction. –  Jul 19 '19 at 14:33
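
The 180th-meridian problem raised in the comments can be shown with two longitudes on either side of the antimeridian. This is a small sketch assuming longitudes in degrees in the range [-180, 180]; the helper name is hypothetical.

```python
def lon_diff_deg(lon1, lon2):
    """Smallest absolute difference between two longitudes, in degrees."""
    d = abs(lon1 - lon2) % 360.0
    return min(d, 360.0 - d)


# Two points 0.2 degrees apart, on opposite sides of the 180th meridian.
naive = abs(179.9 - (-179.9))          # treats longitude as a flat axis
wrapped = lon_diff_deg(179.9, -179.9)  # accounts for wrap-around
print(naive, wrapped)
```

A tree built directly on raw lat/lon values sees the naive difference, so near neighbors across the meridian look almost a full circle apart. Converting to Cartesian coordinates first avoids the issue.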