
I have the following two DataFrames. Call this one df1:

    City           Latitude    Longitude
0   NewYorkCity    40.7128     74.0060
1   Chicago        41.8781     87.6298
2   LA             34.0522     118.2437
3   Paris          48.8566     2.3522

and call this one df2

    Place      Latitude      Longitude
0   75631      26.78436      -80.103
1   89210      26.75347      -80.0192

I want to calculate the distance between each place and every city listed, so the result should look something like this:

    Place      Latitude      Longitude     NewYorkCity     Chicago     Paris
0   75631      26.78436      -80.103       some number     ...         ...
1   89210      26.75347      -80.0192      some number     ...         ...

I'm reading through this post and attempting to adapt it: Pandas Latitude-Longitude to distance between successive rows

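import numpy as np

def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    """Great-circle distance (km by default); inputs are in degrees when to_radians=True."""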
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius * 2 * np.arcsin(np.sqrt(a))


df['dist'] = haversine(df1.Latitude, df.Longitude, df2.Latitude, df2.Longitude)

I know this looks wrong. Do I need a for loop to go through each row of df1?
bosois
  • To simplify your question: do you wish to know the distance between place 75631 and NewYorkCity, Chicago, etc.? – adhg Dec 08 '20 at 00:21
  • I'm not sure why you say it looks wrong. You will create an output table with one column for each row in df1. Then, in nested loops, for each row in df2, you'll fill the columns with your haversine computation for each location in df1. – Tim Roberts Dec 08 '20 at 01:04
  • @adhg Yeah correct! My idea now is to calculate the distance between the cities one at a time and then attach them all at the end. Definitely loops. – bosois Dec 08 '20 at 01:43

2 Answers

0
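import pandas as pd
from scipy.spatial import distance
a = df1.iloc[:, 1:].values  # array of the cities' Lat/Long
b = df2.iloc[:, 1:].values  # array of the places' Lat/Long
# Euclidean distance in degree space (not km or miles), one column per place
df1.join(pd.DataFrame(distance.cdist(a, b, 'euclidean')).rename(columns={0: 75631, 1: 89210}))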



          City  Latitude  Longitude       75631       89210
0  NewYorkCity   40.7128    74.0060  154.737149  154.656475
1      Chicago   41.8781    87.6298  168.410550  168.329860
2           LA   34.0522   118.2437  198.479810  198.397200
3        Paris   48.8566     2.3522   85.358326   85.285379

Alternatively, the long way:

import numpy as np
import pandas as pd

df2.rename(columns={'Latitude': 'Lat', 'Longitude': 'Long'}, inplace=True)  # rename Lat/Long in df2
g = pd.concat([df1, df2.iloc[:1]], axis=1).ffill()  # attach the 1st place to every city
h = pd.concat([df1, df2.iloc[1:]], axis=1).ffill().bfill()  # attach the 2nd place to every city
l = pd.concat([g, h])  # stacked frame (DataFrame.append was removed in pandas 2.0)

# Compute distance (Euclidean, in degrees)
u = l.Latitude.sub(l.Lat)
v = l.Longitude.sub(l.Long)
l['dist'] = np.sqrt(u**2 + v**2)
print(l)
     
          City  Latitude  Longitude    Place       Lat     Long        dist
0  NewYorkCity   40.7128    74.0060  75631.0  26.78436 -80.1030  154.737149
1      Chicago   41.8781    87.6298  75631.0  26.78436 -80.1030  168.410550
2           LA   34.0522   118.2437  75631.0  26.78436 -80.1030  198.479810
3        Paris   48.8566     2.3522  75631.0  26.78436 -80.1030   85.358326
0  NewYorkCity   40.7128    74.0060  89210.0  26.75347 -80.0192  154.656475
1      Chicago   41.8781    87.6298  89210.0  26.75347 -80.0192  168.329860
2           LA   34.0522   118.2437  89210.0  26.75347 -80.0192  198.397200
3        Paris   48.8566     2.3522  89210.0  26.75347 -80.0192   85.285379
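
The figures above are Euclidean distances in degree space, so they are not kilometres or miles. As a hedged sketch of one way to get great-circle distances from the same `cdist` call: `cdist` also accepts a callable metric that receives one row from each array. The helper name `haversine_pair` is made up for this illustration, the coordinates are copied from the question exactly as posted, and `26,75347` is assumed to mean `26.75347`.

```python
import numpy as np
from scipy.spatial import distance

def haversine_pair(u, v, earth_radius=6371):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = np.radians([u[0], u[1], v[0], v[1]])
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

# (lat, lon) rows copied from the question, longitudes exactly as posted.
cities = np.array([[40.7128, 74.0060], [41.8781, 87.6298],
                   [34.0522, 118.2437], [48.8566, 2.3522]])
places = np.array([[26.78436, -80.103], [26.75347, -80.0192]])

# cdist calls haversine_pair once per (city, place) pair.
km = distance.cdist(cities, places, metric=haversine_pair)
print(km)  # rows follow cities, columns follow places
```

A callable metric runs in a Python-level loop, so this is convenient for small inputs but slow for large ones.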
wwnde
  • I think this is a fine solution if you only have a small number of latitude/longitude points. But say the cities dataframe has 20 points and the place dataframe has 15,000; it would be cumbersome to type out 20 pieces. – bosois Dec 08 '20 at 19:32
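
For the larger case raised in the comment above (many cities, thousands of places), one possible approach is NumPy broadcasting, which avoids typing anything per city. This is only a sketch under assumptions: the frames are rebuilt from the question's sample data (with `26,75347` read as `26.75347`), and the haversine function from the question is adapted to convert each argument separately, because arrays of different shapes cannot be packed into a single `np.radians([...])` call.

```python
import numpy as np
import pandas as pd

def haversine(lat1, lon1, lat2, lon2, earth_radius=6371):
    """Great-circle distance in km; inputs in degrees, broadcastable arrays."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))

# Sample frames copied from the question (longitudes kept as posted).
df1 = pd.DataFrame({
    "City": ["NewYorkCity", "Chicago", "LA", "Paris"],
    "Latitude": [40.7128, 41.8781, 34.0522, 48.8566],
    "Longitude": [74.0060, 87.6298, 118.2437, 2.3522],
})
df2 = pd.DataFrame({
    "Place": [75631, 89210],
    "Latitude": [26.78436, 26.75347],
    "Longitude": [-80.103, -80.0192],
})

# Places become a column vector (n_places, 1); cities stay a row (n_cities,).
# Broadcasting then yields an (n_places, n_cities) distance matrix.
dist = haversine(
    df2["Latitude"].to_numpy()[:, None], df2["Longitude"].to_numpy()[:, None],
    df1["Latitude"].to_numpy(), df1["Longitude"].to_numpy(),
)

# One new column per city, named from df1 rather than typed by hand.
result = df2.join(pd.DataFrame(dist, columns=df1["City"].tolist(), index=df2.index))
print(result)
```

The column names come straight from `df1["City"]`, so the same code covers 4 cities or 20 without changes.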
0

The following code worked for me:

import numpy as np

for i in range(len(df1)):
    Lat1 = df1.iloc[i, 1]  # city latitude (2nd column of df1 in the question)
    Lon1 = df1.iloc[i, 2]  # city longitude (3rd column of df1 in the question)
    Lat2 = df2['Latitude']
    Lon2 = df2['Longitude']

    # df1.iloc[i, 0] works down the 1st column to grab the city name,
    # which then becomes the name of the new column in df2

    df2[df1.iloc[i, 0]] = 3958.756 * np.arccos(
        np.cos(np.radians(90 - Lat1)) * np.cos(np.radians(90 - Lat2))
        + np.sin(np.radians(90 - Lat1)) * np.sin(np.radians(90 - Lat2))
        * np.cos(np.radians(Lon1 - Lon2)))

Note that this calculates the straight-line ("as the crow flies") distance in miles between each pair of locations. It doesn't factor in the twists and turns of an actual route.
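
One caveat with the `arccos` form used above: floating-point rounding can push its argument slightly outside [-1, 1] (for example when the two points coincide), and `np.arccos` then returns NaN. A minimal sketch of a guarded version follows; the function name `great_circle_miles` is hypothetical, and the radius 3958.756 is the mile figure from the loop above.

```python
import numpy as np

def great_circle_miles(lat1, lon1, lat2, lon2, radius=3958.756):
    """Spherical law of cosines on colatitudes; inputs in degrees."""
    colat1 = np.radians(90 - np.asarray(lat1, dtype=float))
    colat2 = np.radians(90 - np.asarray(lat2, dtype=float))
    dlon = np.radians(np.asarray(lon1, dtype=float) - np.asarray(lon2, dtype=float))
    cos_angle = (np.cos(colat1) * np.cos(colat2)
                 + np.sin(colat1) * np.sin(colat2) * np.cos(dlon))
    # Clip so rounding error cannot push the value outside arccos's domain.
    return radius * np.arccos(np.clip(cos_angle, -1.0, 1.0))

# A point compared with itself, and a quarter turn along the equator.
same_point = great_circle_miles(26.78436, -80.103, 26.78436, -80.103)
quarter_turn = great_circle_miles(0.0, 0.0, 0.0, 90.0)
print(same_point, quarter_turn)
```

The function also accepts pandas Series or NumPy arrays for any argument, so it can replace the expression inside the loop.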

bosois