How to add a column from one dataframe to another dataframe based on multiple columns

Question

Suppose I have two dataframes, df1 and df2.

Df1: ID First_name Second_Name dob

Sam Daniel 1/28/1997
Dave Cullen 3/18/1997
David Bell 2/18/2000

Df2: ID First_name Second_Name dob Output

Sam Daniel 1/28/1997 1
Dave Cullen 3/18/1997 0
David Bell 2/18/2000 1

Result

Df1: ID First_name Second_Name dob Output

Sam Daniel 1/28/1997 1
Dave Cullen 3/18/1997 0
David Bell 2/18/2000 1

As you can see, I want to add the Output column from df2 to df1 based on ID. However, even if the ID is wrong (in this case David has ID 301 in df2 and 3 in df1), I still want Output to be 1, since the person has the same First_name, Second_Name and dob.

I've tried adding Output based on ID alone, using:

Df3 = df1.join(df2.set_index('ID')['Output'], on='ID', how='left')

This worked for me, but it only considers ID. I want to add Output to df1 based on First_name, Second_Name and dob as well, just in case the ID is wrong.

Can someone help me with this?

Do you need `Df3 = df1.merge(df2, on=['First_name', 'Second_Name', 'dob'], how='left')`? — jezrael, Feb 04 '21 at 07:30
What about ID? That's my primary check. If the ID is wrong, I want to merge using First_name, Second_Name and dob instead, so that I don't lose any data. That's the only reason. Will this work? — Salee, Feb 04 '21 at 07:38
Sorry, then I'm not sure I understand, mainly because the sample data has the same IDs (1, 2, 3) in both dataframes, so it isn't possible to test this scenario. Could you change the sample data to cover it? — jezrael, Feb 04 '21 at 07:42
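
One possible way to combine the two ideas in the comments (ID as the primary check, with First_name, Second_Name and dob as the fallback) is sketched below. It is only an illustration under some assumptions: the ID values are made up to match the description (1, 2, 3 in df1 and 301 for David in df2), IDs in df2 are assumed to be unique, and dob is compared as a plain string. The names `keys`, `by_id` and `by_name` are hypothetical helpers, not part of the original question.

```python
import pandas as pd

# Hypothetical sample data: the ID values are assumed, not taken from the post.
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "First_name": ["Sam", "Dave", "David"],
    "Second_Name": ["Daniel", "Cullen", "Bell"],
    "dob": ["1/28/1997", "3/18/1997", "2/18/2000"],
})
# df2 holds the same people, but David's ID is wrong (301 instead of 3).
df2 = df1.assign(ID=[1, 2, 301], Output=[1, 0, 1])

keys = ["First_name", "Second_Name", "dob"]

# Step 1: look up Output by ID (assumes IDs in df2 are unique).
by_id = df1["ID"].map(df2.set_index("ID")["Output"])

# Step 2: look up Output by name and date of birth as the fallback.
# drop_duplicates keeps one row per person so the merge cannot add rows.
lookup = df2[keys + ["Output"]].drop_duplicates(keys)
by_name = df1.merge(lookup, on=keys, how="left")["Output"]
by_name.index = df1.index  # merge returns a fresh index; realign with df1

# Prefer the ID match; use the fallback only where the ID found nothing.
df3 = df1.assign(Output=by_id.fillna(by_name).astype("Int64"))
print(df3)
```

With this sample data the ID lookup finds nothing for David, so his Output comes from the name and dob match. Rows that match on neither the ID nor the three columns would keep a missing value in Output, which is why the nullable `Int64` dtype is used here.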
