I have hundreds of thousands of records with a start time, stop time, and activity type for that timeframe. I'd like to be able to find the total amount of time that overlaps with specific, preset time intervals and group by the type of activity.
Here's what I have:
start stop activity
2015-12-24 12:55:00.000 2015-12-24 13:25:00.000 a
2015-12-24 13:45:00.000 2015-12-24 13:59:00.000 b
2015-12-24 13:55:00.000 2015-12-24 14:10:00.000 b
2015-12-24 14:13:00.000 2015-12-24 15:05:00.000 a
And here's an example of what I'd like to have as an end result. This is shown grouped by the half hour but being able to designate any normal time interval(day, hour, half-hour, quarter-hour) is ideal:
intervalStart activityMinutes activity
2015-12-24 12:30:00.000 5 a
2015-12-24 13:00:00.000 25 a
2015-12-24 13:30:00.000 19 b
2015-12-24 14:00:00.000 17 a
2015-12-24 14:00:00.000 10 b
2015-12-24 14:30:00.000 30 a
2015-12-24 15:00:00.000 5 a
I have a SQL query that works very well for this that I got from a previous stackoverflow question that I posted earlier this year:
However, using this query is really cumbersome in R and doesn't always work depending on the date range I'm looking at. It's also very slow and I'm hoping that a solution withing R will be faster and more reliable.
Thanks for the help and let me know if there's any other information I can provide!
EDIT - Edited to show results in 30-min interval time buckets