I know my question is very vague, but I'm still learning. For context, we have three DataFrames:
User: UID, OriginLongitude, OriginLatitude, DestinationLongitude, DestinationLatitude
Bus: BusID, Name
BusStops: BusID, BusStopID, Longitude, Latitude
I would like to find the buses each user can take, based on the criterion that both the origin and the destination of a user are within 0.5 km of at least one stop of that bus.
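For a single user and a single bus, the check I have in mind looks roughly like the sketch below. It reads the criterion the way my code does: some stop of the bus is closer than 0.5 km to the origin, and some stop (possibly a different one) is closer than 0.5 km to the destination. The names `haversine_km` and `bus_serves_user` are made up for illustration, and the haversine formula is a spherical approximation used here only to keep the sketch dependency-free; it is not what geopy's `distance` computes.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def bus_serves_user(origin, destination, stops, radius_km=0.5):
    """origin, destination and each stop are (latitude, longitude) tuples.

    True if some stop is near the origin and some stop is near the
    destination; the two stops do not have to be the same one.
    """
    near_origin = any(haversine_km(*origin, *stop) < radius_km for stop in stops)
    near_destination = any(haversine_km(*destination, *stop) < radius_km for stop in stops)
    return near_origin and near_destination
```

Here `stops` would be the list of coordinates belonging to one BusID.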
I have coded the following in Python, but I am very unsure whether this is the right approach, both in terms of efficiency (run time) and correctness (whether it works).
    import geopy.distance as geo

    bus_lists = []  # one list of BusIDs per user, in row order
    for userindex, userrow in User.iterrows():
        routes = []
        ocoor = (userrow['OriginLatitude'], userrow['OriginLongitude'])
        dcoor = (userrow['DestinationLatitude'], userrow['DestinationLongitude'])
        for routeindex, routerow in Bus.iterrows():
            # flag[0]: a stop is near the origin; flag[1]: a stop is near the destination
            flag = [False, False]
            routeid = routerow['BusID']
            # only look at the stops that belong to this bus
            routestops = BusStops[BusStops['BusID'] == routeid]
            for rsindex, rsrow in routestops.iterrows():
                rcoor = (rsrow['Latitude'], rsrow['Longitude'])
                if geo.distance(ocoor, rcoor).km < 0.5:
                    flag[0] = True
                if geo.distance(dcoor, rcoor).km < 0.5:
                    flag[1] = True
                if all(flag):
                    routes.append(routeid)
                    break
        bus_lists.append(routes)
    # assign the whole column at once; its length matches the number of users
    User['Busses'] = bus_lists
As I see it, the runtime grows roughly with (number of users) x (number of buses) x (number of stops), so close to n^3, which might not be ideal. Is there a better solution to this problem? I would appreciate any help, whether it is improving the runtime or correcting the code.
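One direction I have been wondering about, as a rough sketch rather than a tested solution, is to drop the nested `iterrows()` loops and compute all user-to-stop distances at once with NumPy broadcasting, then reduce per bus. The function names `haversine_km` and `busses_per_user` are my own invention, and the sketch swaps geopy's geodesic distance for the haversine formula (a spherical approximation), so results very close to the 0.5 km cutoff could differ slightly. It also builds two (users x stops) arrays in memory, which I assume is acceptable for moderately sized data.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; inputs in degrees, broadcastable arrays."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def busses_per_user(o_lat, o_lon, d_lat, d_lon,
                    stop_lat, stop_lon, stop_bus, radius_km=0.5):
    """Return one list of bus IDs per user (same order as the inputs)."""
    # users become a column (n_users, 1), stops a row (1, n_stops)
    o_lat, o_lon, d_lat, d_lon = (
        np.asarray(x, dtype=float)[:, None] for x in (o_lat, o_lon, d_lat, d_lon)
    )
    stop_lat = np.asarray(stop_lat, dtype=float)[None, :]
    stop_lon = np.asarray(stop_lon, dtype=float)[None, :]
    stop_bus = np.asarray(stop_bus)

    # boolean matrices of shape (n_users, n_stops)
    near_origin = haversine_km(o_lat, o_lon, stop_lat, stop_lon) < radius_km
    near_dest = haversine_km(d_lat, d_lon, stop_lat, stop_lon) < radius_km

    result = [[] for _ in range(near_origin.shape[0])]
    for bus in np.unique(stop_bus).tolist():
        cols = stop_bus == bus  # stops that belong to this bus
        ok = near_origin[:, cols].any(axis=1) & near_dest[:, cols].any(axis=1)
        for i in np.flatnonzero(ok):
            result[i].append(bus)
    return result
```

With the column names from my code, I imagine calling it as `User['Busses'] = busses_per_user(User['OriginLatitude'], User['OriginLongitude'], User['DestinationLatitude'], User['DestinationLongitude'], BusStops['Latitude'], BusStops['Longitude'], BusStops['BusID'])`. The only remaining Python loop is over distinct buses, not over every user and stop pair.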