I need to merge two tables by zip codes such that date ranges from table, ID_dates should contain the ranges in table, zipcodes.
Description of data:
Table ID_dates gives pre and post that covers the entire range of dates for ID_dates.
Combination of ID and pre gives a unique combination for the group within which the ranges in zip codes must lie.
Ranges in ID_dates are continuous with no gaps.
Zipcode is string but I am unable to have R read as a character in my example below (I am new to R so please forgive my mistakes in the code below,if any)
The ranges in zipcodes do not cover the range of pre and post. The ranges in zipcodes may lie outside of the pre and post dates in ID_dates and hence the start and end dates of zipcodes where this happens have to be truncated. For each group (combination of zip code and ID), the endpoints are pre and post.
zipcodes <- read.table(text="
zipcode start end var1
08210 03/15/1992 03/29/1992 1
08210 06/29/1993 06/30/1993 2
07039 03/28/1992 03/31/1992 5
07046 06/6/1994 06/6/1994 1
", colClasses='character',header=T, stringsAsFactors=F)
zipcodes$start <- as.Date(zipcodes$start,format = "%m/%d/%y")
zipcodes$end <- as.Date(zipcodes$end,format = "%m/%d/%y")
ID_dates <- read.table(text="
ID zipcode t1 t2 pre post var2
A 08210 03/17/1992 03/28/1992 03/17/1992 06/01/1993 1
A 08210 03/29/1992 04/05/1992 03/17/1992 06/01/1993 2
A 07039 04/06/1992 06/01/1993 03/17/1992 06/01/1993 3
A 07046 07/07/1995 08/07/1995 07/07/1995 08/07/1995 5
B 90002 01/01/1996 02/01/1996 01/01/1996 02/01/1996 8
", colClasses='character',header=T, stringsAsFactors=F)
ID_dates$t1 <- as.Date(ID_dates$t1,format = "%m/%d/%y")
ID_dates$t2 <- as.Date(ID_dates$t2,format = "%m/%d/%y")
ID_dates$pre <- as.Date(ID_dates$pre,format = "%m/%d/%y")
ID_dates$post <- as.Date(ID_dates$post,format = "%m/%d/%y")
finaltable <- read.table(text="
id zipcode t1 t2 var1 var2
A 08210 03/17/1992 03/28/1992 1 1
A 08210 03/29/1992 04/05/1992 NA 2
A 07039 04/06/1992 06/01/1993 NA 3
A 07046 07/07/1995 08/07/1995 NA 5
B 90002 01/01/1996 02/01/1996 NA 8
", header=T, stringsAsFactors=F)
I am unable to write a code to achieve the above result.