Your last example should have a Within_7days value of 1 for ID of 3 according to your data.
The package lubridate
is great for working with times. Also, I use dplyr
for manipulating datasets.
# Load Libraries
library(dplyr)
library(lubridate)
# Recreate dataframes
df1 <- data.frame(ID = 1:3, Date.Discharged = c("2014-08-04","2014-12-10","2015-01-01"))
df2 <- data.frame(ID = c(1,2,2,3), CheckDate = c("2014-08-11","2014-12-14","2015-05-01","2015-01-05"))
# Make the characters in Date.Discharged and CheckDate into Date objects:
df1$Date.Discharged <- as.Date(df1$Date.Discharged)
df2$CheckDate <- as.Date(df2$CheckDate)
I assume that you only care about the minimum value for each ID for the check-in Date. That way if you select the minimum value, you get rid of any problems with multiple entries with the same ID.
df2 <- df2 %>% group_by(ID) %>% summarize(CheckDate = min(CheckDate))
# Now join the 2 dataframes together
df1 <- left_join(df1,df2) # this joins by ID by default
Finally, we make the column with Y/N
df1 <- df1 %>% mutate(within_7days = (CheckDate <= Date.Discharged + days(7))*1)
df1
ID Date.Discharged CheckDate within_7days
1 1 2014-08-04 2014-08-11 1
2 2 2014-12-10 2014-12-14 1
3 3 2015-01-01 2015-01-05 1
EDIT:
This is slightly inefficient, but will handle the case where you have values in the 2nd dataset that occur before the discharge date:
We can ignore the following line above:
df2 <- df2 %>% group_by(ID) %>% summarize(CheckDate = min(CheckDate))
and instead go straight to the join step and modify it as follows:
df1 <- left_join(df1,df2) %>% filter(CheckDate > Date.Discharged)
This will add all of the CheckDates that take place each Date Discharged for the same ID. This is again assuming that the ID's in DF1 are unique.