Thanks in advance!
I have two large datasets, both of which contain date/time columns of interest. The first (the head() of which is pasted below) has a single date/time field that I am interested in: the ‘RoundDateTimeGMT’ column. This dataset is rather large (over 500,000 rows). Each row belongs to an individual, identified by the PumaID column.
  PumaID RoundDateTimeGMT
1    P01    3/3/2011 0:00
2    P01    3/3/2011 0:00
3    P01    3/3/2011 0:00
4    P01    3/3/2011 0:00
5    P01    3/3/2011 0:00
6    P01    3/3/2011 0:00
The second dataset has two date/time fields, ‘FstClstrTime’ and ‘LastClstrTime’, representing a start and an end time, respectively (head() shown below). All times have been converted to a format R recognizes using as.POSIXct(). As above, each row belongs to an individual, identified by the PumaID column.
  PumaID   FstClstrTime   LastClstrTime
1    P01 8/29/2011 6:01  8/29/2011 8:01
2    P01           <NA>            <NA>
3    P01 9/10/2011 2:00 9/12/2011 12:01
4    P01  9/9/2011 8:00  9/9/2011 14:01
5    P01  9/7/2011 8:01  9/8/2011 10:00
6    P01 9/4/2011 10:01  9/6/2011 12:01
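For anyone reproducing this, here is a minimal sketch of that kind of as.POSIXct() conversion. It assumes the raw values are character strings in month/day/year hour:minute form and that the times are in GMT; the data frame name `clusters` is hypothetical and holds only a few of the rows shown above.

```r
# Hypothetical stand-in for the second dataset (values copied from the rows above)
clusters <- data.frame(
  PumaID        = c("P01", "P01", "P01"),
  FstClstrTime  = c("8/29/2011 6:01", NA, "9/10/2011 2:00"),
  LastClstrTime = c("8/29/2011 8:01", NA, "9/12/2011 12:01"),
  stringsAsFactors = FALSE
)

# Assumed input format: month/day/year hour:minute, interpreted as GMT
fmt <- "%m/%d/%Y %H:%M"
clusters$FstClstrTime  <- as.POSIXct(clusters$FstClstrTime,  format = fmt, tz = "GMT")
clusters$LastClstrTime <- as.POSIXct(clusters$LastClstrTime, format = fmt, tz = "GMT")

# Missing times stay NA after conversion
str(clusters)
```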
My goal is to create a new binary column in the first dataset that indicates whether each RoundDateTimeGMT falls between the ‘FstClstrTime’ and ‘LastClstrTime’ of any row in the second dataset for the same individual. The comparison is only needed where the PumaIDs of the two datasets match. I think this can be done with a for() loop, but I am open to any suggestions. I just need to check every RoundDateTimeGMT (again, there are over 500,000) against every ‘FstClstrTime’ and ‘LastClstrTime’ pair for that individual.
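To make the goal concrete, here is a minimal sketch of the check on toy data. The data frame names `fixes` and `clusters`, the new column name `InCluster`, the second individual "P02", and the choice to treat both endpoints as inclusive are all assumptions for illustration. The loop runs over the rows of the second dataset rather than over the 500,000 times, so each pass is a vectorized comparison; rows with missing start or end times are dropped first.

```r
fmt <- "%m/%d/%Y %H:%M"  # assumed input format

# Hypothetical toy version of the first dataset
fixes <- data.frame(
  PumaID = c("P01", "P01", "P01", "P02"),
  RoundDateTimeGMT = as.POSIXct(
    c("3/3/2011 0:00", "8/29/2011 7:00", "9/11/2011 0:00", "8/29/2011 7:00"),
    format = fmt, tz = "GMT"),
  stringsAsFactors = FALSE
)

# Hypothetical toy version of the second dataset
clusters <- data.frame(
  PumaID = c("P01", "P01", "P01"),
  FstClstrTime  = as.POSIXct(c("8/29/2011 6:01", NA, "9/10/2011 2:00"),
                             format = fmt, tz = "GMT"),
  LastClstrTime = as.POSIXct(c("8/29/2011 8:01", NA, "9/12/2011 12:01"),
                             format = fmt, tz = "GMT"),
  stringsAsFactors = FALSE
)

# Intervals with a missing start or end cannot contain anything, so drop them
clusters <- clusters[!is.na(clusters$FstClstrTime) &
                     !is.na(clusters$LastClstrTime), ]

# Start with 0 everywhere, then flag times inside any interval of the same PumaID
fixes$InCluster <- 0L
for (i in seq_len(nrow(clusters))) {
  hit <- fixes$PumaID == clusters$PumaID[i] &
    fixes$RoundDateTimeGMT >= clusters$FstClstrTime[i] &   # inclusive start (assumption)
    fixes$RoundDateTimeGMT <= clusters$LastClstrTime[i]    # inclusive end (assumption)
  fixes$InCluster[which(hit)] <- 1L
}

print(fixes)
```

On this toy input, only the second and third rows of `fixes` fall inside a P01 interval; the P02 row has no matching PumaID in `clusters` and stays 0.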
Because the datasets are so large, dput() is not practical, so apologies for not attaching any data. I hope you can still offer some suggestions on how to accomplish the above goal.
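If a small sample would help, dput() can be applied to just the first few rows instead of the whole object. A minimal sketch, where `fixes` is a hypothetical stand-in for the first dataset:

```r
# Hypothetical stand-in for the first dataset
fixes <- data.frame(
  PumaID = rep("P01", 6),
  RoundDateTimeGMT = rep("3/3/2011 0:00", 6),
  stringsAsFactors = FALSE
)

# dput() on a small subset gives output short enough to paste into a message
small <- head(fixes, 3)
dput(small)
```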
Kind regards!