0

I am unable to compare the column values of 2 different dataframes.

First dataset has 500 rows and second dataset has 128 rows. I am mentioning few of the rows of datasets.

First dataset:

Country_name    Weather   President
 USA              16       Trump 
 China            19       Xi

2nd dataset

Country_name    Weather   Currency
 North Korea      26       NKT 
 China            19       Yaun

I want to compare the country_name column because I don't have Currency column in dataset 1 , so if the country_name matches, then I can append its value. My final dataframe should be like this

Country_name  Weather  President  Currency
USA             16      Trump      Dollar
China           19      Xi         Yaun

In the above final dataframes, we have to include only those countries whose country_name is present in both the datasets and corresponding value of Currency should be appended as shown above.

Rishavv
  • 301
  • 1
  • 6
  • 2
    but `USA` doesn't appear in both data frames? Shouldn't the final result be: `China 19 Xi Yun` ?! – Yahya Jan 05 '21 at 18:55

1 Answers1

1

If you just want to keep records that only match in Country_name, and execlude everything else, you can then use the merge function, which basically finds the intersection between two dataframes based on some given column:

d1 = pd.DataFrame(data=[['USA', 16, 'Trump'], ['China', 19, 'Xi']],
                      columns=['Country_name', 'Weather', 'President'])
d2 = pd.DataFrame(data=[['North Korea', 26, 'NKT'], ['China', 19, 'Yun']],
                      columns=['Country_name', 'Weather', 'Currency'])

result = pd.merge(d1, d2, on=['Country_name'], how='inner')\
        .rename(columns={'Weather_x': 'Weather'}).drop(['Weather_y'], axis=1)
print(result) 

Output

  Country_name  Weather President Currency
0        China       19        Xi      Yun
Yahya
  • 13,349
  • 6
  • 30
  • 42