4

Given a JSON file,

{"BusStopCode": "00481", "RoadName": "Woodlands Rd", "Description": "BT PANJANG TEMP BUS PK", "Latitude": 1.383764, "Longitude": 103.7583},
{"BusStopCode": "01012", "RoadName": "Victoria St", "Description": "Hotel Grand Pacific", "Latitude": 1.29684825487647, "Longitude": 103.85253591654006}

and so on, for about 5000 bus stops, I am trying to find the bus stops nearest to any user-given lat/long, using the following formula:

from math import cos, sqrt
R = 6371000  # radius of the Earth in m
# lat1, lon1, lat2, lon2 must be in radians for this formula
x = (lon2 - lon1) * cos(0.5*(lat2+lat1))
y = (lat2 - lat1)
d = R * sqrt( x*x + y*y )

My question is: for a user input of lat1 and lon1, how can I compute the distances between lat1/lon1 and lat2/lon2 (where lat2/lon2 takes the value of each of the 5000 lat/lon pairs in the JSON file), and then print the 5 lowest distances?

I have thought of using list.sort, but I am not sure how to compute all 5000 distances in Python.

Thank you so much.

Edit:

With the code from Eric Duminil, the following code works for my needs.

from math import cos, radians, sqrt
import json

with open("stops.json") as f:
    busstops = json.load(f)

R = 6371000  # radius of the Earth in m

def distance(lon1, lat1, lon2, lat2):
    # the formula expects radians, but the JSON stores degrees
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    x = (lon2 - lon1) * cos(0.5 * (lat2 + lat1))
    y = lat2 - lat1
    return R * sqrt(x*x + y*y)

buslist = sorted(busstops, key=lambda d: distance(d["Longitude"], d["Latitude"], 103.5, 1.2))
print(buslist[:5])

where 103.5, 1.2 is an example of a user-supplied longitude and latitude.

Joey Ngo
  • Just check out https://pypi.python.org/pypi/geopy. You might find a way to do it there. Also check this answer: https://stackoverflow.com/a/19412565/6165783 – Manoj Jadhav Oct 09 '17 at 08:20
  • Thank you for the links. However, I am not concerned with calculating the distance between 2 lat/lon pairs, but rather: with a user-given lat/lon, how can I find the 5 nearest bus stops in a JSON file of 5000 bus stops? – Joey Ngo Oct 09 '17 at 08:29
  • Apologies if my question is not phrased well. – Joey Ngo Oct 09 '17 at 08:30

2 Answers

3

You could simply define a function to calculate the distance and use it to sort bus stops with the key argument:

from math import cos, sqrt, pi, radians

R = 6371000 #radius of the Earth in m
def distance(lon1, lat1, lon2, lat2):
    # inputs are in degrees: cos() needs radians, and the final
    # factor 2*pi*R/360 converts the result from degrees to metres
    x = (lon2 - lon1) * cos(radians(0.5*(lat2+lat1)))
    y = (lat2 - lat1)
    return (2*pi*R/360) * sqrt( x*x + y*y )

busstops = [{"BusStopCode": "00481", "RoadName": "Woodlands Rd", "Description": "BT PANJANG TEMP BUS PK", "Latitude": 1.383764, "Longitude": 103.7583},
{"BusStopCode": "01012", "RoadName": "Victoria St", "Description": "Hotel Grand Pacific", "Latitude": 1.29684825487647, "Longitude": 103.85253591654006}]

print(sorted(busstops, key= lambda d: distance(d["Longitude"], d["Latitude"], 103.5, 1.2)))
# [{'BusStopCode': '00481', 'RoadName': 'Woodlands Rd', 'Description': 'BT PANJANG TEMP BUS PK', 'Latitude': 1.383764, 'Longitude': 103.7583}, {'BusStopCode': '01012', 'RoadName': 'Victoria St', 'Description': 'Hotel Grand Pacific', 'Latitude': 1.29684825487647, 'Longitude': 103.85253591654006}]

Once this list is sorted, you can simply extract the 5 closest bus stops with [:5]. It should be fast enough, even with 5000 bus stops.
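If only the few closest stops are needed, a full sort is not strictly required. The following is a minimal sketch, not part of the original answer, that uses heapq.nsmallest from the standard library and also returns each distance so that it can be printed. The helper name nearest_stops and its parameters are made up for illustration, and the distance function assumes coordinates in degrees.

```python
import heapq
from math import cos, radians, sqrt

R = 6371000  # radius of the Earth in m

def distance(lon1, lat1, lon2, lat2):
    """Equirectangular approximation; inputs in degrees, result in metres."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    x = (lon2 - lon1) * cos(0.5 * (lat2 + lat1))
    y = lat2 - lat1
    return R * sqrt(x * x + y * y)

def nearest_stops(stops, user_lon, user_lat, n=5):
    """Hypothetical helper: return up to n (distance_m, stop) pairs, closest first."""
    with_dist = (
        (distance(user_lon, user_lat, s["Longitude"], s["Latitude"]), s)
        for s in stops
    )
    return heapq.nsmallest(n, with_dist, key=lambda pair: pair[0])

# The two sample stops from the question; a real list would hold all 5000.
busstops = [
    {"BusStopCode": "00481", "RoadName": "Woodlands Rd",
     "Description": "BT PANJANG TEMP BUS PK",
     "Latitude": 1.383764, "Longitude": 103.7583},
    {"BusStopCode": "01012", "RoadName": "Victoria St",
     "Description": "Hotel Grand Pacific",
     "Latitude": 1.29684825487647, "Longitude": 103.85253591654006},
]

for dist_m, stop in nearest_stops(busstops, 103.5, 1.2):
    print(f"{dist_m:8.0f} m  {stop['BusStopCode']}  {stop['Description']}")
```

For 5000 stops the difference from sorted(...)[:5] is likely negligible, so this is mostly a matter of taste.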

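Note that if you don't care about the specific distance but only want to sort bus stops, you could use this function as key. It skips the square root and the scale factor, neither of which changes the order:

from math import cos, radians

def distance2(lon1, lat1, lon2, lat2):
    # inputs in degrees; cos() needs radians
    x = (lon2 - lon1) * cos(radians(0.5*(lat2+lat1)))
    y = (lat2 - lat1)
    return x*x + y*y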
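As a quick sanity check, here is a small sketch (an illustration, assuming coordinates in degrees) showing that sorting by the squared value gives the same order as sorting by the distance in metres, since the square root and a positive scale factor are both monotonic. The list names by_distance and by_squared are made up for this example.

```python
from math import cos, radians, sqrt

R = 6371000  # radius of the Earth in m

def distance2(lon1, lat1, lon2, lat2):
    """Squared equirectangular distance, in squared degrees."""
    x = (lon2 - lon1) * cos(radians(0.5 * (lat2 + lat1)))
    y = lat2 - lat1
    return x * x + y * y

def distance(lon1, lat1, lon2, lat2):
    """Distance in metres, derived from the squared value."""
    return R * radians(sqrt(distance2(lon1, lat1, lon2, lat2)))

# The two sample stops from the question.
busstops = [
    {"BusStopCode": "01012", "Latitude": 1.29684825487647,
     "Longitude": 103.85253591654006},
    {"BusStopCode": "00481", "Latitude": 1.383764, "Longitude": 103.7583},
]

user_lon, user_lat = 103.5, 1.2
by_distance = sorted(
    busstops,
    key=lambda d: distance(d["Longitude"], d["Latitude"], user_lon, user_lat),
)
by_squared = sorted(
    busstops,
    key=lambda d: distance2(d["Longitude"], d["Latitude"], user_lon, user_lat),
)

print([s["BusStopCode"] for s in by_distance])
print([s["BusStopCode"] for s in by_squared])
```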
Eric Duminil
1

I did the same thing for a similar project, but calculating all the distances for a large dataset can take a lot of time.

I ended up using k-nearest neighbors (KNN), which is much faster, and you don't need to recalculate all the distances every time:

import numpy as np
from sklearn.neighbors import NearestNeighbors

# "...." stands for the other fields of each stop; the list needs at least n_neighbors entries
buslist = [{ ...., 'latitude':45.5, 'longitude':7.6}, { ...., 'latitude':48.532, 'longitude':7.451}]

buslist_coords = np.array([[x['latitude'], x['longitude']] for x in buslist])  # extracting (lat, lon) coordinates

# fitting the knn on the coordinates (plain Euclidean distance on degrees)
knn = NearestNeighbors(n_neighbors=5)
knn.fit(buslist_coords)
# you can pickle the fitted knn and load it later to determine the nearest point to a user


# finding the nearest points for a given coordinate
userlocation = np.array([[47.456, 6.25]])
distances, indices = knn.kneighbors(userlocation)

# get the 5 nearest stations in a list
# (a plain list cannot be indexed with a NumPy array, so index one by one)
nearest_stations = [buslist[i] for i in indices[0][:5]]  # the order of buslist must be the same as when the knn was fitted

# printing the 5 nearest stations
for station in nearest_stations:
    print(station)

After that, I built a graph with networkx from this data, but I'm still using knn.kneighbors(userlocation) to find the nearest point to a user.

bky