0

I have a df with 3 people ("members") and I would like to measure the distance of these people from 3 locations. The end result would be a df ranking the 3 locations from closest to farther for all 3 people. I have the geocoordinates for all 3 people and all 3 locations and this is what I've tried so far but I don't know how to finish the loop to concatenate the frames to a master frame. Please help!:

df = []
df_2 = []

for m in range(len(members)):

    df_temp_member = pd.DataFrame({'member_id': members.iloc[[m]]['member_id']
                                   })

    for s in range(len(locations)):
        dist = haversine(lon1 = members.iloc[[m]]['longitude']
                        ,lat1 = members.iloc[[m]]['latitude']
                        ,lon2 = locations.iloc[[s]]['Longitude']
                        ,lat2 = locations.iloc[[s]]['Latitude'])

        df_temp = pd.DataFrame({'location_name': locations.iloc[[s]]['location_name'],
                                'Distance': dist,
                                })

        df.append(df_temp)

    df = pd.concat(df)
    df = df.sort_values(by='Distance', ascending=True, na_position='first').reset_index(drop = True).reset_index(drop = True)

    df_temp_1 = pd.DataFrame({'location_1': df.iloc[[0]]['location_name'],
                              'Distance_1': df.iloc[[0]]['Distance'],
                               })

    df_temp_2 = pd.DataFrame({'location_2': df.iloc[[1]]['location_name'].reset_index(drop = True),
                              'Distance_2': df.iloc[[1]]['Distance'].reset_index(drop = True),
                               })

    df_temp_3 = pd.DataFrame({'location_3': df.iloc[[2]]['location_name'].reset_index(drop = True),
                              'Distance_3': df.iloc[[2]]['Distance'].reset_index(drop = True),
                               })
    
    frames = [df_temp_1, df_temp_2, df_temp_3]

    df_2 = pd.concat(frames, axis = 1)
June Smith
  • 163
  • 1
  • 11
  • Please include samples of `members` and `locations` that can be used for testing as well as your expected output for the __provided__ input data. See [MRE - Minimal, Reproducible, Example](https://stackoverflow.com/help/minimal-reproducible-example), and [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/15497888) for more information. – Henry Ecker May 31 '21 at 17:12

1 Answers1

0

Sample data*:

>>> members
      Name   Longitude   Latitude
0   Sherie   16.196499  44.040776
1    Cathi  107.000799  -7.018167
2  Grissel  118.152148  24.722747

>>> locations
       Location   Longitude   Latitude
0     Quarteira   -8.098960  37.102928
1       Weishan  100.307174  25.227212
2  Šuto Orizare   21.429841  41.992429

Haversine function** especially modified for Series:

def haversine_series(sr):
    lon1, lat1, lon2, lat2 = sr[["Longitude1", "Latitude1", "Longitude2", "Latitude2"]]
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2.0)**2
    c = 2 * np.arcsin(np.sqrt(a))
    km = 6371 * c
    return km

Cross members and locations dataframes and compute distance:

distances = members.merge(locations, how="cross", suffixes=('1', '2'))
distances["Distance"] = distances.apply(haversine_series, axis="columns")
>>> distances
      Name  Longitude1  Latitude1      Location  Longitude2  Latitude2      Distance
0   Sherie   16.196499  44.040776     Quarteira   -8.098960  37.102928   2182.362810
1   Sherie   16.196499  44.040776       Weishan  100.307174  25.227212   7640.729330
2   Sherie   16.196499  44.040776  Šuto Orizare   21.429841  41.992429    482.470815
3    Cathi  107.000799  -7.018167     Quarteira   -8.098960  37.102928  12695.443489
4    Cathi  107.000799  -7.018167       Weishan  100.307174  25.227212   3657.950305
5    Cathi  107.000799  -7.018167  Šuto Orizare   21.429841  41.992429  10165.429008
6  Grissel  118.152148  24.722747     Quarteira   -8.098960  37.102928  11135.298789
7  Grissel  118.152148  24.722747       Weishan  100.307174  25.227212   1798.285195
8  Grissel  118.152148  24.722747  Šuto Orizare   21.429841  41.992429   8719.611566

Rank :

>>> distances.pivot(index="Location", columns="Name", values="Distance") \
             .rank(axis="columns").astype(int)
Name          Cathi  Grissel  Sherie
Location
Quarteira         3        2       1
Weishan           2        1       3
Šuto Orizare      3        2       1

Credits:

* Data generated from Mockaroo

** Source from https://stackoverflow.com/a/25767765/15239951

Corralien
  • 109,409
  • 8
  • 28
  • 52