I have 2 data sets each around 20k rows.
df 1 contains the following information
first name | last name | race | sex | year of birth | unique ID
df2 contains the following
first name | last name | race | sex | age
I would like to join the data sets so that I get a unique ID for each row relevant row in DF 2. The issue is since this data is longitudinal some people are in df2 several times and have multiple ages. for example
John | smith | white | male | 29
John | smith | white | male | 30
whereas df1 contains the following
John | smith | white | male | 1991 | 74b23
ultimately I would like the data to look like this
John | smith | white | male | 29 | 74b23
John | smith | white | male | 30 | 74b23
is there a way to return exact matches on the first name, last name, race, and gender but match on age give or take a year?
Thanks!