
Given a list of lat/lon points, how can we find the minimum number of 50-mile radius circles (and their lat/lon points) such that these circles cover all the points in the list?

The solution does not need to be optimal, and the radii/distances can be approximated for simplicity or computed with a helper library such as geopy.distance.

For example, here is a CSV list of lat/lon points:

41.81014,-72.550028
41.995833,-72.581525
41.377211,-72.150307
41.710626,-72.763862
41.55254,-72.815454
41.415022,-73.401914
41.0554,-73.54142
41.660572,-72.725673
41.350949,-72.871673
41.280278,-72.987515
41.23354,-73.151677
41.235174,-73.038092
41.58254,-73.034321
41.89121,-72.6521
41.340446,-73.078943
41.81886,-73.0755
41.228735,-73.225326
41.839019,-71.883778
41.585192,-71.99693
41.611472,-72.901357
41.783976,-72.748229
43.634242,-70.347774
44.842191,-68.74156
43.934038,-69.985271
43.474,-70.5141
44.312403,-69.804993
42.552616,-70.937616
41.877743,-71.068577
41.940344,-71.351931
42.399035,-71.071855
42.168221,-72.642232
42.518609,-71.135461
42.160827,-71.498868
42.481583,-71.024154
42.305328,-71.398387
42.29247,-71.7751
41.796058,-71.321145
42.376695,-71.090028
42.364178,-71.156462
41.971125,-70.716858
42.280435,-71.655929
42.359487,-71.607159
42.503468,-70.919421
42.194395,-71.774687
42.357311,-72.547241
42.328872,-71.062845
42.033714,-71.310581
42.39976,-71.000326
42.527193,-71.71374
42.495264,-73.206116
41.63729,-71.003268
42.110519,-70.927683
42.152383,-71.073541
42.02714,-71.1438
42.740784,-71.161323
41.773672,-70.745562
42.788072,-71.115959
42.623622,-71.318304
42.137401,-70.83883
42.348748,-71.504967
41.749066,-71.207427
42.2045,-71.1553
42.22142,-71.021844
42.589718,-71.159895
42.344172,-71.099961
42.364561,-71.102575
42.2882972,-71.1267483
42.350679,-71.114022
42.494932,-71.103401
42.42072,-71.09902
42.388648,-71.118659
42.484104,-71.186185
41.666927,-70.294616
42.275401,-71.029299
42.299241,-71.062748
42.361045,-71.0626
42.764475,-71.215039
43.2189,-71.485199
42.702771,-71.437791
43.045615,-71.461202
42.79899,-71.53679
42.941002,-71.473513
42.928188,-72.301906
43.235048,-70.884519
43.048951,-70.818587
43.633682,-72.322002
44.466154,-73.18226
Athena Wisdom

1 Answer


Updated answer based on comments:

You have many options.

Here are three different ways to do it:

1. With scipy.spatial.cKDTree:

Pros :

  • This will be fast

Cons :

  • Less accurate, because the computed distance is Euclidean
  • The radius is in the same unit as your inputs, so here in degrees
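
To give a sense of why a Euclidean radius in degrees is only a rough approximation, here is a small sketch that estimates how many miles one degree spans at a given latitude. It assumes a spherical Earth with a mean radius of 3958.8 miles; the helper name `miles_per_degree` is made up for this illustration.

```python
from math import cos, pi, radians

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation (assumption)

def miles_per_degree(lat_deg):
    """Return (miles per degree of latitude, miles per degree of longitude) at lat_deg."""
    per_degree_lat = 2 * pi * EARTH_RADIUS_MILES / 360
    # Meridians converge toward the poles, so a degree of longitude shrinks with cos(latitude)
    per_degree_lon = per_degree_lat * cos(radians(lat_deg))
    return per_degree_lat, per_degree_lon

lat_miles, lon_miles = miles_per_degree(42.0)
print(f"1 degree of latitude  ~ {lat_miles:.1f} miles")
print(f"1 degree of longitude ~ {lon_miles:.1f} miles at 42 N")
```

Around 42°N, a Euclidean radius of 1 degree therefore reaches roughly 69 miles north-south but only about 51 miles east-west, so the "circle" is really an ellipse on the ground.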

I would go with a cKDTree and a radius query: find all points within the radius of a starting point, remove these points from the list, and continue with the remaining points. This is not optimal, but it can be a good basis.

from scipy.spatial import cKDTree

points = [(41.81014,-72.550028), (41.995833,-72.581525), (41.377211,-72.150307), (41.710626,-72.763862), (41.55254,-72.815454), (41.415022,-73.401914), (41.0554,-73.54142), (41.660572,-72.725673), (41.350949,-72.871673), (41.280278,-72.987515), (41.23354,-73.151677), (41.235174,-73.038092), (41.58254,-73.034321), (41.89121,-72.6521), (41.340446,-73.078943), (41.81886,-73.0755), (41.228735,-73.225326), (41.839019,-71.883778), (41.585192,-71.99693), (41.611472,-72.901357), (41.783976,-72.748229), (43.634242,-70.347774), (44.842191,-68.74156), (43.934038,-69.985271), (43.474,-70.5141), (44.312403,-69.804993), (42.552616,-70.937616), (41.877743,-71.068577), (41.940344,-71.351931), (42.399035,-71.071855), (42.168221,-72.642232), (42.518609,-71.135461), (42.160827,-71.498868), (42.481583,-71.024154), (42.305328,-71.398387), (42.29247,-71.7751), (41.796058,-71.321145), (42.376695,-71.090028), (42.364178,-71.156462), (41.971125,-70.716858), (42.280435,-71.655929), (42.359487,-71.607159), (42.503468,-70.919421), (42.194395,-71.774687), (42.357311,-72.547241), (42.328872,-71.062845), (42.033714,-71.310581), (42.39976,-71.000326), (42.527193,-71.71374), (42.495264,-73.206116), (41.63729,-71.003268), (42.110519,-70.927683), (42.152383,-71.073541), (42.02714,-71.1438), (42.740784,-71.161323), (41.773672,-70.745562), (42.788072,-71.115959), (42.623622,-71.318304), (42.137401,-70.83883), (42.348748,-71.504967), (41.749066,-71.207427), (42.2045,-71.1553), (42.22142,-71.021844), (42.589718,-71.159895), (42.344172,-71.099961), (42.364561,-71.102575), (42.2882972,-71.1267483), (42.350679,-71.114022), (42.494932,-71.103401), (42.42072,-71.09902), (42.388648,-71.118659), (42.484104,-71.186185), (41.666927,-70.294616), (42.275401,-71.029299), (42.299241,-71.062748), (42.361045,-71.0626), (42.764475,-71.215039), (43.2189,-71.485199), (42.702771,-71.437791), (43.045615,-71.461202), (42.79899,-71.53679), (42.941002,-71.473513), (42.928188,-72.301906), (43.235048,-70.884519), 
(43.048951,-70.818587), (43.633682,-72.322002), (44.466154,-73.18226)]

# Radius of circle. Note that the unit is the same as in your list (here, degrees.)
radius = 1

num_circles = 0

list_is_no_empty = True

while(list_is_no_empty):

    # Take the first point in order to find all points within distance radius
    start_point = points[0]

    # Create a KDTree
    tree = cKDTree(points)

    # Find indexes of all points in radius
    indexes_of_points_in_radius = tree.query_ball_point(start_point, radius)

    # Create the list of points to remove (points that were found within distance radius)
    points_to_remove = [points[i] for i in indexes_of_points_in_radius]

    # Remove these points
    points = list(set(points) - set(points_to_remove))

    # Increment the number of circles
    num_circles += 1

    # If no points remain, exit loop
    if points == []:
        list_is_no_empty = False

print("Number of circles:", num_circles)
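
The loop above only counts the circles, while the question also asks for their lat/lon positions. A minimal sketch of the same greedy idea that records each starting point as a circle center is shown below. It uses only the standard library and a haversine distance on a spherical Earth (mean radius 3958.8 miles, an assumption) instead of a tree, so it is simpler but slower on large inputs. The names `haversine_miles` and `greedy_cover` are hypothetical helpers, not part of any library.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation (assumption)

def haversine_miles(p, q):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p, *q))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def greedy_cover(points, radius_miles=50):
    """Greedily pick circle centers: each center is the first point not yet covered."""
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining[0]
        centers.append(center)
        # Keep only the points that lie outside the circle just placed
        remaining = [p for p in remaining if haversine_miles(center, p) > radius_miles]
    return centers

# Three points taken from the question's list
sample = [(41.81014, -72.550028), (41.995833, -72.581525), (44.842191, -68.74156)]
centers = greedy_cover(sample)
print("Number of circles:", len(centers))
print("Centers:", centers)
```

Because the centers are always input points, the result is a valid cover but not necessarily a minimal one.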

2. With sklearn.neighbors.BallTree:

Pros:

  • This will be more accurate because the computed distance here is the great-circle distance (Haversine formula).

Cons:

  • The radius still cannot be given directly in miles: with the haversine metric it is an angle, in the same unit as the inputs (radians, see the note below)
  • A little slower than scipy's cKDTree (about 2 times slower in my tests)

Note that the haversine metric expects its inputs as (latitude, longitude) pairs in radians (https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.DistanceMetric.html). In my tests with scikit-learn 1.0.1, passing degrees did not raise an error, but the distances are then no longer true great-circle distances, so convert the points and the radius first:

from math import radians

# Convert every (lat, lon) pair and the radius from degrees to radians.
# start_point is taken from points inside the loop, so it needs no separate conversion.
points = [tuple(map(radians, point)) for point in points]
radius = radians(radius)
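
To express the radius in miles rather than degrees with the haversine metric, one common approximation (also mentioned in the comments) is to divide the distance by the Earth's radius, which gives the angle in radians. Here is a minimal sketch, assuming a mean Earth radius of 3958.8 miles; `miles_to_haversine_radius`, `points_deg`, and `points_rad` are names made up for this illustration.

```python
from math import radians

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation (assumption)

def miles_to_haversine_radius(miles):
    """Convert a distance in miles to an angle in radians on a spherical Earth."""
    return miles / EARTH_RADIUS_MILES

# Two points taken from the question's list, in degrees
points_deg = [(41.81014, -72.550028), (41.995833, -72.581525)]

# The haversine metric expects (lat, lon) in radians
points_rad = [(radians(lat), radians(lon)) for lat, lon in points_deg]

# A 50-mile radius expressed as an angle in radians
radius = miles_to_haversine_radius(50)
print(radius)
```

`points_rad` and `radius` can then be used in place of `points` and `radius` in the BallTree code.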

Code with BallTree:

from sklearn.neighbors import BallTree
import numpy as np

num_circles = 0

list_is_no_empty = True

while(list_is_no_empty):

    # Take the first point in order to find all points within distance radius
    start_point = np.array([points[0]])

    # Create a BallTree, and choose the Haversine formula (great-circle distance)
    tree = BallTree(points, metric="haversine")

    # Find indexes of all points in radius
    indexes_of_points_in_radius = tree.query_radius(start_point, r=radius)[0]

    # Create the list of points to remove (points that were found within distance radius)
    points_to_remove = [points[i] for i in indexes_of_points_in_radius]

    # Remove these points
    points = list(set(points) - set(points_to_remove))

    # Increment the number of circles
    num_circles += 1

    # If no points remain, exit loop
    if points == []:
        list_is_no_empty = False

print("Number of circles:", num_circles)

3. With sklearn.neighbors.BallTree, using a user-defined distance function:

Pros :

  • We can use a very accurate distance here
  • We can specify this distance in miles (or meters)

Cons:

  • Much slower than cKDTree (about 10 times slower in my tests)

from pyproj import Geod
from sklearn.neighbors import BallTree
import numpy as np

# Create a WGS84 ellipsoid
geod = Geod(ellps='WGS84')

# User-defined distance function for BallTree.
# pyproj's "inv" method returns the distance in meters between 2 points.
# It computes the geodesic distance using Karney's algorithm.
def geodesic_distance(point_01, point_02):
    lat1, lon1 = point_01
    lat2, lon2 = point_02
    # geod.inv expects longitude first, then latitude
    az12, az21, distance_in_meters = geod.inv(lon1, lat1, lon2, lat2)
    return distance_in_meters

def miles_to_meters(miles):
    return miles * 1609.344

# Radius in miles
radius_in_miles = 50

radius_in_meters = miles_to_meters(radius_in_miles)

num_circles = 0

list_is_no_empty = True

while(list_is_no_empty):

    # Take the first point in order to find all points within distance radius
    start_point = np.array([points[0]])

    # Create a BallTree, and choose our custom function
    tree = BallTree(points, metric=geodesic_distance)

    # Find indexes of all points in radius, specified in meters
    indexes_of_points_in_radius = tree.query_radius(start_point, r=radius_in_meters)[0]

    # Create the list of points to remove (points that were found within distance radius)
    points_to_remove = [points[i] for i in indexes_of_points_in_radius]

    # Remove these points
    points = list(set(points) - set(points_to_remove))

    # Increment the number of circles
    num_circles += 1

    # If no points remain, exit loop
    if points == []:
        list_is_no_empty = False

print("Number of circles:", num_circles)

If you want to learn more about converting miles to degrees (and why, in fact, it cannot be done directly) and about computing distances on Earth:

Is the Haversine Formula or the Vincenty's Formula better for calculating distance?

https://gis.stackexchange.com/questions/84885/difference-between-vincenty-and-great-circle-distance-calculations

Rivers
  • Looks good! How do you convert the radius of circle from 50 miles to be assigned to the `radius` variable? Do we have to use `BallTree` with `metric='haversine'` since we are dealing with latitude/longitude coordinates? – Athena Wisdom Nov 21 '21 at 18:45
  • It is not as simple as it looks. It involves (at least) geodesy and great-circle distance. That's a whole new question, and it would be impossible to answer it in a comment. I'll be glad to answer it if you post it (for example, "How to convert miles to degrees in Python?") – Rivers Nov 21 '21 at 19:11
  • I think that `BallTree` is part of scikit-learn, not SciPy, but yes, in this case haversine would be the metric to choose instead of Euclidean. – Rivers Nov 21 '21 at 19:25
  • For speed I would reproject to a local coordinate system to work in euclidean space – Ian Turton Nov 21 '21 at 19:40
  • Got it working with lat/lon by converting the points to radians using `numpy.deg2rad` and using `BallTree(points, metric="haversine")`. Thanks! – Athena Wisdom Nov 21 '21 at 20:27
  • To get an approximation of the radius to use with haversine: `distance_in_meters = 5000`, `earth_radius = 6371000`, `radius = distance_in_meters / earth_radius`. From: https://stackoverflow.com/questions/66470898/finding-pairs-of-latitude-and-longitude-within-a-certain-radius-in-python – Willem Hendriks Nov 23 '21 at 12:10