I have two data.frames with information.
The first one is the larger of the two and contains many rows, each a unique speed measurement in a curve. It has a curveID column, a datetime of the measurement, and many columns with other speed and curve information.
The second one is a small database of known roadworks. It contains three columns: curveID, from and to. curveID has the same factor levels as in the first database, from is a datetime (POSIXct) marking the start of roadworks on that section, and to is a datetime (POSIXct) marking the end of the roadworks.
Greatly simplified, the databases look like this:
speedmeasurements <- data.frame("curve_id" = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3), "datetime" = c(0, 2, 8, 9, 1, 2, 3, 4, 5, 1, 3, 5))
roadworks <- data.frame("curve_id" = c(1, 1, 3), "from" = c(1, 5, 2), "to" = c(3, 7, 4))
I've added a column to create intervals (using lubridate), like this:
roadworks$range <- interval(roadworks$from, roadworks$to)
Now I want to add a column roadworks to my first database. This should be a logical value indicating whether a speed measurement was taken during roadworks. So I need a bit of code that checks whether the combination of curveID and datetime of a speed measurement falls within a roadworks timeslot for that curve.
In the simplified example, I would like to get this result:
speedmeasurements <- data.frame("curve_id" = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3), "datetime" = c(0, 2, 8, 9, 1, 2, 3, 4, 5, 1, 3, 5), "roadworks" = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE))
I've been thinking of an ifelse() like this:
speedmeasurements$roadworks<- ifelse(speedmeasurements$curve_id == roadworks$curve_id & speedmeasurements$datetime %within% roadworks$range, T, F)
but this seems to fail due to different object lengths.
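The lengths differ because == and & work element by element and recycle the shorter vector, so row i of the measurements is only compared with row i of the roadworks, not with every roadworks row. To make the intended check explicit, here is a minimal base R sketch on the numeric toy data above. It assumes plain <= comparisons in place of %within%, and in_roadworks is a hypothetical helper name introduced only for illustration. It goes row by row, so it is probably too slow for the full data set:

```r
# Toy data from above (numeric stand-ins for the POSIXct datetimes)
speedmeasurements <- data.frame(
  curve_id = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3),
  datetime = c(0, 2, 8, 9, 1, 2, 3, 4, 5, 1, 3, 5)
)
roadworks <- data.frame(
  curve_id = c(1, 1, 3),
  from     = c(1, 5, 2),
  to       = c(3, 7, 4)
)

# Hypothetical helper: TRUE if at least one roadworks row has the same
# curve_id and a [from, to] range (inclusive) that covers time t
in_roadworks <- function(id, t) {
  any(roadworks$curve_id == id & roadworks$from <= t & t <= roadworks$to)
}

# Apply the check to every (curve_id, datetime) pair of the measurements
speedmeasurements$roadworks <- mapply(
  in_roadworks,
  speedmeasurements$curve_id,
  speedmeasurements$datetime
)
```

On the toy data this flags only the measurement at datetime 2 in curve 1 and the one at datetime 3 in curve 3, which matches the desired result above.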
Does anyone have a way forward? Perhaps a data.table solution, but I'm quite a novice in that area.
Kind regards,
Johan