How do I merge two datasets based on the common key in pandas?

Question

I have two datasets that contain domain names:

df1:

varA     domains            
123     www.google.com   
456     www.ebay.com     
789     www.amazon.com   
101     www.nbc.com      
....

df2:

 urls            varB
www.cnn.com      xsd
www.ebay.com     wer
www.nbc.com      xyz
www.amazon.com   zyx
....

I need to add the varA values from df1 to df2 for the matching domains/urls, so the output would look like this:

 urls            varA   varB
www.ebay.com     456    wer
www.nbc.com      101    xyz
www.amazon.com   789    zyx
....

All of the domains in df2 that do not have a matching domain in df1 should be removed.

I have this code:

target_cols = ['domains', 'urls', 'varB', 'varA']
df2.merge(df1[target_cols], on='urls', how='inner')

The code is generating an error.

How do I fix it? Any alternative solutions that can work?

This particular question is answered by the section in the linked duplicate target, under the section "Avoiding duplicate key column in output". — cs95, Dec 07 '18 at 11:32

score 3 · Accepted Answer · answered Mar 12 '17 at 23:23


The error occurs because the key columns you are merging on do not have the same name: the key is 'domains' in df1 and 'urls' in df2. This will work:

pd.merge(df1, df2, left_on = 'domains', right_on = 'urls', how = 'inner').drop('domains', axis = 1)


    varA    urls            varB
0   456     www.ebay.com    wer
1   789     www.amazon.com  zyx
2   101     www.nbc.com     xyz
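As a possible alternative, here is a minimal sketch that rebuilds the two frames from the rows shown in the question (the sample data is limited to those rows, and the variable name `result` is only illustrative). It first shows that selecting `target_cols` from df1 raises a `KeyError`, since df1 has no 'urls' or 'varB' column. It then renames 'domains' to 'urls' so both frames share a key name and merges starting from df2, which should give the column and row order asked for in the question.

```python
import pandas as pd

# Sample frames rebuilt from the question (only the rows shown there).
df1 = pd.DataFrame({
    "varA": [123, 456, 789, 101],
    "domains": ["www.google.com", "www.ebay.com", "www.amazon.com", "www.nbc.com"],
})
df2 = pd.DataFrame({
    "urls": ["www.cnn.com", "www.ebay.com", "www.nbc.com", "www.amazon.com"],
    "varB": ["xsd", "wer", "xyz", "zyx"],
})

# The original attempt fails before the merge even runs:
# df1 has no 'urls' or 'varB' column, so the selection raises KeyError.
try:
    df1[["domains", "urls", "varB", "varA"]]
except KeyError as err:
    print("KeyError:", err)

# Rename the key in df1 so both frames share the column name 'urls',
# then inner-join from df2 so that df2's row order is kept.
result = df2.merge(df1.rename(columns={"domains": "urls"}), on="urls", how="inner")

# Reorder the columns to match the layout requested in the question.
result = result[["urls", "varA", "varB"]]
print(result)
```

Because the rename makes the key name identical on both sides, there is no leftover 'domains' column to drop afterwards.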

answered Mar 12 '17 at 23:23

Vaishali


Partially worked, varB did not get transferred – Feyzi Bagirov Mar 13 '17 at 02:19
Are you not getting the same output as what I printed? – Vaishali Mar 13 '17 at 02:22
only varA and urls, varB is not in the output – Feyzi Bagirov Mar 13 '17 at 02:29
The merge will merge all the columns as long as they are in the original df. I can't get what happened to varB. Can you once again print df1 and df2 and check. I am pretty confident that this code works – Vaishali Mar 13 '17 at 02:36
You are correct, my apologies - was trying to print the misspelled variable. It is working! – Feyzi Bagirov Mar 13 '17 at 02:51
