I'm looking for a command-line solution to find the nearest pairs of points in a CSV list of coordinates.
This was answered here for Excel, but I need a somewhat different solution.
I'm NOT looking for the nearest neighbor of every point, but for the pairs of points with the smallest distance between them.
I would like to match many power plants from GEO, so a (Python?) command-line tool would be great.
Here is an example dataset:
Chicoasén Dam,16.941064,-93.100828
Tuxpan Oil Power Plant,21.014891,-97.334492
Petacalco Coal Power Plant,17.983575,-102.115252
Angostura Dam,16.401226,-92.778926
Tula Oil Power Plant,20.055825,-99.276857
Carbon II Coal Power Plant,28.467176,-100.698559
Laguna Verde Nuclear Power Plant,19.719095,-96.406347
Carbón I Coal Power Plant,28.485238,-100.69096
Manzanillo I Oil Power Plant,19.027372,-104.319274
Tamazunchale Gas Power Plant,21.311282,-98.756266
The tool should print "Carbon II Coal Power Plant" and "Carbón I Coal Power Plant", because this pair has the minimal distance.
A code fragment could be:
from math import radians, cos, sin, asin, sqrt
import csv

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    km = 6371 * c  # 6371 km is the mean Earth radius
    return km

with open('mexico-test.csv', newline='') as csvfile:
    so = csv.reader(csvfile, delimiter=',', quotechar='|')
    data = []
    for row in so:
        data.append(row)  # each row is [name, latitude, longitude] as strings

# haversine() takes (lon, lat) while the CSV stores (lat, lon), so swap the order
print(haversine(-100.698559, 28.467176, -100.69096, 28.485238))
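
One way the fragment above might be extended to what I'm after is a brute-force comparison of every pair. This is only a minimal sketch: the helper names `read_points` and `closest_pairs` are made up for illustration, the CSV is assumed to have exactly the three columns name, latitude, longitude with no header, and three rows of the example data are inlined so it runs without a file.

```python
import csv
import heapq
import io
from itertools import combinations
from math import radians, cos, sin, asin, sqrt


def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))


def read_points(csvfile):
    """Parse rows of name,latitude,longitude into (name, lat, lon) tuples."""
    return [(name, float(lat), float(lon))
            for name, lat, lon in csv.reader(csvfile)]


def closest_pairs(points, k=1):
    """Return the k pairs with the smallest distance as (km, name_a, name_b)."""
    distances = (
        (haversine(lon_a, lat_a, lon_b, lat_b), name_a, name_b)
        for (name_a, lat_a, lon_a), (name_b, lat_b, lon_b) in combinations(points, 2)
    )
    # nsmallest keeps only the k best pairs instead of sorting all of them
    return heapq.nsmallest(k, distances)


# Three rows from the example dataset, inlined so the sketch runs without a file.
sample = io.StringIO(
    "Carbon II Coal Power Plant,28.467176,-100.698559\n"
    "Laguna Verde Nuclear Power Plant,19.719095,-96.406347\n"
    "Carbón I Coal Power Plant,28.485238,-100.69096\n"
)
for km, name_a, name_b in closest_pairs(read_points(sample)):
    print(f"{name_a} <-> {name_b}: {km:.2f} km")
```

With a real file, `read_points(open('mexico-test.csv', newline='', encoding='utf-8'))` could replace the inlined sample, and `k` could be raised to list several near pairs. Comparing every pair is quadratic in the number of points, so this is probably only practical for lists of a few thousand plants.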