I need to join two datasets together:
- time series data taken every second with a heart rate measure (with 1.4 million records)
>allsecsHR
timestamp HRbpm
2023-03-22 09:04:53 101
2023-03-22 09:04:54 124
2023-03-22 09:04:55 103
2023-03-22 09:04:56 111
2023-03-22 09:04:57 112
2023-03-22 09:04:58 143
2023-03-22 09:04:59 109
2023-03-22 09:05:00 129
2023-03-22 09:05:01 122
2023-03-22 09:05:02 125
2023-03-22 09:05:03 110
- behavioural records with a start and end timestamp
>bhr
Behaviour BhrTimeStart BhrTimeEnd
Forage 2023-03-22 09:04:53 2023-03-22 09:04:58
Vigilance 2023-03-22 09:04:58 2023-03-22 09:05:03
Forage 2023-03-22 09:05:03 2023-03-22 09:05:10
At the end I would like a dataset that has each row representing one second, with the behaviour that was being performed per second
I have tried to do this in mySQL and in R using sqldf and powerjoin but it keeps losing connection to the sql server after running for 2 hours or in R it does not complete and crashes or returns Error: vector memory exhausted (limit reached?)
. I would really appreciate some help to find an efficient way to do this!
I think a data.table solution might be the fastest but I am not sure how to do it, a tidy solution would also be great!
So far I have tried:
mySQL/sqldf
library(sqldf)
sqldf("select * from allsecsHR
left join bhr
on allsecsHR.timestamp between bhr.BhrTimeStart and bhr.BhrTimeEnd")
powerjoin
library(powerjoin)
power_left_join(
allsecsHR, bhr,
by = ~.x$timestamp > .y$BhrTimeStart &
(.x$timestamp < .y$BhrTimeEnd | is.na(.y$BhrTimeEnd)),
keep = "left")
The output table I would like is this:
timestamp HRbpm Bhr BhrTimeStart BhrTimeEnd
2023-03-22 09:04:53 101 Forage 2023-03-22 09:04:53 2023-03-22 09:04:58
2023-03-22 09:04:54 124 Forage 2023-03-22 09:04:53 2023-03-22 09:04:58
2023-03-22 09:04:55 103 Forage 2023-03-22 09:04:53 2023-03-22 09:04:58
2023-03-22 09:04:56 111 Forage 2023-03-22 09:04:53 2023-03-22 09:04:58
2023-03-22 09:04:57 112 Forage 2023-03-22 09:04:53 2023-03-22 09:04:58
2023-03-22 09:04:58 143 Vigilance 2023-03-22 09:04:58 2023-03-22 09:05:03
2023-03-22 09:04:59 109 Vigilance 2023-03-22 09:04:58 2023-03-22 09:05:03
2023-03-22 09:05:00 129 Vigilance 2023-03-22 09:04:58 2023-03-22 09:05:03
2023-03-22 09:05:01 122 Vigilance 2023-03-22 09:04:58 2023-03-22 09:05:03
2023-03-22 09:05:02 125 Vigilance 2023-03-22 09:04:58 2023-03-22 09:05:03
2023-03-22 09:05:03 110 Forage 2023-03-22 09:05:03 2023-03-22 09:05:10