I'm working with a data frame that has the columns 'timestamps' and 'amount'. The data can be generated like this:
sample_size <- 40
start_date <- as.POSIXct("2020-01-01 00:00")
end_date <- as.POSIXct("2020-01-03 00:00")
timestamps <- as.POSIXct(sample(seq(start_date, end_date, by=60), sample_size))
amount <- rpois(sample_size, 5)
df <- data.frame(timestamps=timestamps, amount=amount)
Now I'd like to plot the sum of the amount entries per timeframe (e.g. every hour, 30 min, or 20 min). The final plot would look like a histogram of the timestamps, but instead of counting how many timestamps fall into each timeframe, it should show the total amount that falls into each timeframe.
How can I approach this? I could create an extra vector with the summed amount for each timeframe, but I don't know how to proceed from there.
I'd also like to add a feature to reduce by hour, such that just one day is plotted (note that the range between start_date and end_date is two days) and, for each timeframe (let's say every hour), the amount of data falling into that hour on any day is plotted. In this case the data
2020-01-01 13:03:00 5
2020-01-02 13:21:00 10
2020-01-02 13:38:00 1
2020-01-01 13:14:00 3
would produce a bar of height sum(5, 10, 1, 3) = 19 in the timeframe 13:00-14:00. How can I implement the plotting so that I can easily switch between these two modes (plot all days / plot just one day and reduce)?
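To make the expected result concrete, here is a minimal base R sketch that reproduces the four-row example above. The `tz = "UTC"` argument and the `hour` helper column are assumptions added for illustration; they are not part of my actual data.

```r
# The four example rows; tz = "UTC" is an assumption so the clock hours are unambiguous
example <- data.frame(
  timestamps = as.POSIXct(
    c("2020-01-01 13:03:00", "2020-01-02 13:21:00",
      "2020-01-02 13:38:00", "2020-01-01 13:14:00"),
    tz = "UTC"
  ),
  amount = c(5, 10, 1, 3)
)

# Hour of day as a two-digit label, which discards the date
example$hour <- format(example$timestamps, "%H")

# Sum of amount per hour of day, pooled over both days
hourly <- aggregate(amount ~ hour, data = example, FUN = sum)
hourly
```

This should give a single row with hour 13 and amount 19, which is the bar height I expect for 13:00-14:00.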
EDIT: Following the advice of @Gregor Thomas I added a grouping column like this:
df$time_group <- lubridate::floor_date(df$timestamps, unit="20 minutes")
Now I'm wondering how to ignore the dates and thus reduce by 20-minute frame (independent of the date).
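One possible direction, sketched in base R under a few assumptions: `sum_by_bin`, `bin_minutes`, and `fold_days` are hypothetical names made up for this illustration, the bin width is assumed to divide a day evenly, and the example rows use `tz = "UTC"`. The idea is to label each row by the clock time at which its bin starts, and to prepend the date only when the days should stay separate.

```r
# Hypothetical helper (not from any package): sum `amount` per time bin.
#   bin_minutes: bin width in minutes (assumed to divide 1440 evenly)
#   fold_days:   FALSE keeps the date, TRUE pools all days into one
sum_by_bin <- function(df, bin_minutes = 20, fold_days = FALSE) {
  lt <- as.POSIXlt(df$timestamps)
  minute_of_day <- lt$hour * 60 + lt$min
  # Round each minute of the day down to the start of its bin
  bin_start <- as.integer(floor(minute_of_day / bin_minutes) * bin_minutes)
  label <- sprintf("%02d:%02d", bin_start %/% 60L, bin_start %% 60L)
  if (!fold_days) {
    # Keep the date so that each day gets its own set of bins
    label <- paste(format(df$timestamps, "%Y-%m-%d"), label)
  }
  binned <- data.frame(bin = label, amount = df$amount)
  aggregate(amount ~ bin, data = binned, FUN = sum)
}

# The four example rows; tz = "UTC" is an assumption to keep the clock times fixed
df_example <- data.frame(
  timestamps = as.POSIXct(
    c("2020-01-01 13:03:00", "2020-01-02 13:21:00",
      "2020-01-02 13:38:00", "2020-01-01 13:14:00"),
    tz = "UTC"
  ),
  amount = c(5, 10, 1, 3)
)

by_date <- sum_by_bin(df_example, bin_minutes = 20, fold_days = FALSE)
folded  <- sum_by_bin(df_example, bin_minutes = 20, fold_days = TRUE)

# Same plotting call for both modes; las = 2 rotates the bin labels
barplot(folded$amount, names.arg = folded$bin, las = 2, ylab = "amount")
```

With the `time_group` column from the edit above, `format(df$time_group, "%H:%M")` should yield the same date-free labels. Note that `aggregate` drops bins that contain no rows, so empty timeframes would not appear as zero-height bars in this sketch.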