1

I currently have a dataframe A consisting of a column (code1) of country codes such as CA, RU, US etc. I have another dataframe B that has 3 columns where the first column has all possible country codes, the second has a longitude value and the third has a latitude value. I'm trying to loop through the A, get the first country code in the first column, match it to the country code in the first column of B and then get the associated longitude and latitude of that country and so forth. I plan to create a new dataframe containing the codes from A (the first column) and the newly extracted longitude and latitude values.

So far my function looks as follows

def get_coords():
    for i in range(len(A["code1"])):
        for j in range(len(B["code"])):
            if A["code1"[i] = B["code"[j]: #if the country codes match
                latitude = B["lat"][j] #gets the latitude of the matched country code
                longitude = B["long"][j] #gets the longitude

However, this seems to be inefficient and I'm not sure if it is even matching the codes from the dataframes correctly. Is there a better method of going about what I am trying to achieve?

For reference len(A["code1"]) = 581 and len(B["code"] = 5142

Here is a sample input of data:

A = pd.DataFrame({'code1': ['US', 
                   'RU', 'AO', 'ZW']})

B = pd.DataFrame({'code': ['US', 'ZW', 'RU', 'YE', 'AO'], 
                   'long': [65.216000, 65.216000,18.500000,-63.032000,19.952000], 'lat': [12.500000, 33.677000,-12.500000,18.237000,60.198000]})

I am trying to have the output look like

A = pd.DataFrame({'code1': ['US', 'RU', 'AO', 'ZW'], 'longitude':[65.216000,18.500000, 19.952000, 65.216000], 'latitude': [12.500000, -12.500000, 60.198000, 33.677000]})
Angie
  • 183
  • 3
  • 13

2 Answers2

1

use pd.merge and specify the left_on column to merge on as well as the right_on column, since the two column you want to merge have different column names. Then, .drop the excess column that you don't need.

A = pd.merge(A,B,how='left',left_on='code1',right_on='code').drop(['code'], axis=1)

output:

    code1   long        lat
0   US      65.216      12.500
1   RU      18.500      -12.500
2   AO      19.952      60.198
3   ZW      65.216      33.677
David Erickson
  • 16,433
  • 2
  • 19
  • 35
  • Thank you! I'm curious, would this also work if A had 2 columns instead of 1? For example code1 and code2 and I was matching both columns to B to get a 6 column dataframe instead of 3 columns? For example the output could be ```A = pd.DataFrame({'code1': ['US', 'RU', 'AO', 'ZW'], 'longitude1':[65.216000,18.500000, 19.952000, 65.216000], 'latitude1': [12.500000, -12.500000, 60.198000, 33.677000], 'code2': ['ZW','RU','US','AO',], 'longitude2':[65.216000,18.500000,65.216000,19.952000], 'latitude2': [33.677000,-12.500000,12.500000,60.198000]})``` – Angie Jul 17 '20 at 00:44
  • @Angie yes it would wourk; however, this is what the output would be: `A = pd.DataFrame({'code1': ['US', 'RU', 'AO', 'ZW'], 'longitude1_x':[65.216000,18.500000, 19.952000, 65.216000], 'latitude1_x': [12.500000, -12.500000, 60.198000, 33.677000], 'code2': ['ZW','RU','US','AO',], 'longitude1_y':[65.216000,18.500000,65.216000,19.952000], 'latitude_y': [33.677000,-12.500000,12.500000,60.198000]}) A` .... unless see next comment. – David Erickson Jul 17 '20 at 00:57
  • ... you pass `, suffixes=('1', '2')` to control what you want your suffixes to be: for example, `A = pd.merge(A,B,how='left',left_on='code1',right_on='code', suffixes=('1', '2'))` – David Erickson Jul 17 '20 at 00:58
0
n [108]: A = pd.DataFrame({'code1': ['US',
     ...:                    'RU', 'AO', 'ZW']})

In [109]: B = pd.DataFrame({'code': ['US', 'ZW', 'RU', 'YE', 'AO'],
     ...:                    'long': [65.216000, 65.216000,18.500000,-63.032000,19.952000], 'lat': [12.500000, 33.67700
     ...: 0,-12.500000,18.237000,60.198000]})

In [110]: A.rename({"code1":"code"},axis=1,inplace=True)

In [111]: A = pd.merge(A,B, on="code").rename({"code":"code1"},axis=1)

In [112]: A
Out[112]:
  code1    long     lat
0    US  65.216  12.500
1    RU  18.500 -12.500
2    AO  19.952  60.198
3    ZW  65.216  33.677
bigbounty
  • 16,526
  • 5
  • 37
  • 65