0

I have two data frames that need to be joined in R. I've included an example data set below. One data frame includes id, a number that changes when the date range changes, and a range of date. The range is given by two columns: one column identifies the beginning of the period and another the end of the period. I need to match that row and all the information to another data set that includes gps pings that occur on and between the range provided. For example, for unit 1, between 12/1/2020 and 12/2/2020 rain is 10. I want the information about rain being 10 to be joined to a data set for pings on and between 12/1/2020 and 12/2/2020.

The final, joined data set will be joined on the ID and the date range (df3).

df1
ID    rain   begdate    enddate
1     10     12/1/2020  12/2/2020
1     15     12/3/2020  12/4/2020
2     20     12/2/2020  12/4/2020
4     40     12/1/2020  12/3/2020
4     45     12/4/2020  12/5/2020
3     30     12/2/2020  12/3/2020

df2
lat   long   ID    Date 
47    10     1     12/1/2020
44    20     1     12/1/2020
47    10     1     12/2/2020
44    20     1     12/2/2020
47    10     1     12/3/2020
44    20     1     12/4/2020
46    30     2     12/2/2020
47    28     2     12/3/2020
46    32     2     12/4/2020
43    39     4     12/1/2020
44    39     4     12/2/2020
43    38     4     12/2/2020
43    36     4     12/3/2020
43    39     4     12/4/2020
44    39     4     12/4/2020
43    38     4     12/5/2020
43    36     4     12/5/2020
44    38     3     12/2/2020
45    22     3     12/2/2020
47    23     3     12/3/2020


df3
lat   long   ID    Date         rain   begdate    enddate
47    10     1     12/1/2020    10     12/1/2020  12/2/2020
44    20     1     12/1/2020    10     12/1/2020  12/2/2020
47    10     1     12/2/2020    10     12/1/2020  12/2/2020
44    20     1     12/2/2020    10     12/1/2020  12/2/2020
47    10     1     12/3/2020    10     12/1/2020  12/2/2020
46    30     2     12/2/2020    20     12/2/2020  12/4/2020
47    28     2     12/3/2020    20     12/2/2020  12/4/2020
46    32     2     12/4/2020    20     12/2/2020  12/4/2020
43    39     4     12/1/2020    40     12/1/2020  12/3/2020
44    39     4     12/2/2020    40     12/1/2020  12/3/2020
43    38     4     12/2/2020    40     12/1/2020  12/3/2020
43    36     4     12/3/2020    40     12/1/2020  12/3/2020
43    39     4     12/4/2020    45     12/4/2020  12/5/2020
44    39     4     12/4/2020    45     12/4/2020  12/5/2020
43    38     4     12/5/2020    45     12/4/2020  12/5/2020
43    36     4     12/5/2020    45     12/4/2020  12/5/2020
44    38     3     12/2/2020    30     12/2/2020  12/3/2020
45    22     3     12/2/2020    30     12/2/2020  12/3/2020
47    23     3     12/3/2020    30     12/2/2020  12/3/2020

I've tried to join the datasets using fuzzy_join and sqldf with no luck. Any and all help would be wonderful!

Tham v
  • 51
  • 4
  • This should work: `library(data.table);setDT(df1);setDT(df2);df1[,c("begdate","enddate") := lapply(.SD,as.Date,"%m/%d/%Y"),.SDcols = c("begdate","enddate")];df2[,Date := as.Date(Date,"%m/%d/%Y")];df2[df1, on = .(ID, Date >= begdate , Date <= enddate)]` – Ian Campbell Jan 08 '21 at 04:33
  • @IanCampbell That worked!! Thank you so much, I've been banging my head against the wall trying to get this to work for so long!! – Tham v Jan 08 '21 at 05:10

0 Answers0