I have a dataset with some employee information, and I would like to see if certain records appear in another DataFrame. However, there may be duplicate IDs (I know...), so I wanted to filter where the ID AND date of birth are the same. I tried doing it with a merge, but then all the columns get added, which I don't want. How should I go about this?
Example data:
df1 = pd.DataFrame({"ID": [1, 2, 3, 4, 5], "DOB": ["1987-12-03", "1993-04-05", "2000-01-24", "1995-05-18", "1974-10-10"], "JOB": [6, 7, 8, 9, 10]})
df2 = pd.DataFrame({"ID": [1, 1, 2, 3, 3, 4, 4], "DOB": ["1987-12-03", "1999-06-16", "1993-04-05", "2000-01-24", "1968-11-13", "1995-05-18", "1988-12-12"], "JOB": [6, 11, 7, 8, 12, 9, 13]})
I want the output to be:
ID DOB JOB
0 1 1987-12-03 6
1 2 1993-04-05 7
2 3 2000-01-24 8
3 4 1995-05-18 9
Since those are the values that are in both DataFrames based on ID AND DOB.