I have two datasets that look like this:
name Longitude Latitude continent
0 Aruba -69.982677 12.520880 North America
1 Afghanistan 66.004734 33.835231 Asia
2 Angola 17.537368 -12.293361 Africa
3 Anguilla -63.064989 18.223959 North America
4 Albania 20.049834 41.142450 Europe
And another dataset looks like this:
COUNTRY GDP (BILLIONS) CODE
0 Afghanistan 21.71 AFG
1 Albania 13.40 ALB
2 Algeria 227.80 DZA
3 American Samoa 0.75 ASM
4 Andorra 4.80 AND
Here, columns name
and COUNTRY
contains the country names but not in the same order.
How to combine the second dataframe into first one and add the CODE
columns to the first dataframe.
Required output:
name Longitude Latitude continent CODE
0 Aruba -69.982677 12.520880 North America NaN
1 Afghanistan 66.004734 33.835231 Asia AFG
2 Angola 17.537368 -12.293361 Africa NaN
3 Anguilla -63.064989 18.223959 North America NaN
4 Albania 20.049834 41.142450 Europe ALB
Attempt:
import numpy as np
import pandas as pd
df = pd.DataFrame({'name' : ['Aruba', 'Afghanistan', 'Angola', 'Anguilla', 'Albania'],
'Longitude' : [-69.982677, 66.004734, 17.537368, -63.064989, 20.049834],
'Latitude' : [12.520880, 33.835231, '-12.293361', 18.223959, 41.142450],
'continent' : ['North America','Asia','Africa','North America','Europe'] })
print(df)
df2 = pd.DataFrame({'COUNTRY' : ['Afghanistan', 'Albania', 'Algeria', 'American Samoa', 'Andorra'],
'GDP (BILLIONS)' : [21.71, 13.40, 227.80, 0.75, 4.80],
'CODE' : ['AFG', 'ALB', 'DZA', 'ASM', 'AND']})
print(df2)
pd.merge(left=df, right=df2,left_on='name',right_on='COUNTRY')
# but this fails