cut() from base R has two methods for objects of class Date and POSIXt. Both assume that weeks start on Monday by default; this can be changed to Sunday with start.on.monday = FALSE.
dates <- c("2016-04-04", "2016-04-05", "2016-04-06", "2016-04-07", "2016-04-08",
           "2016-04-09", "2016-04-10", "2016-04-11", "2016-04-12", "2016-04-13",
           "2016-04-14")
result <- data.frame(
  dates,
  cut_Date = cut(as.Date(dates), "week"),
  cut_POSIXt = cut(as.POSIXct(dates), "week"),
  stringsAsFactors = FALSE)
result
#        dates   cut_Date cut_POSIXt
#1  2016-04-04 2016-04-04 2016-04-04
#2  2016-04-05 2016-04-04 2016-04-04
#3  2016-04-06 2016-04-04 2016-04-04
#4  2016-04-07 2016-04-04 2016-04-04
#5  2016-04-08 2016-04-04 2016-04-04
#6  2016-04-09 2016-04-04 2016-04-04
#7  2016-04-10 2016-04-04 2016-04-04
#8  2016-04-11 2016-04-11 2016-04-11
#9  2016-04-12 2016-04-11 2016-04-11
#10 2016-04-13 2016-04-11 2016-04-11
#11 2016-04-14 2016-04-11 2016-04-11
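For comparison, here is a minimal sketch of the Sunday-based variant mentioned at the top. It uses the same eleven days; the variable name week_sunday is only illustrative. Since 2016-04-04 is a Monday, the first week should now be labelled with the preceding Sunday.

```r
# Same eleven days as in the example above, built as Date objects
dates <- as.Date("2016-04-04") + 0:10

# Let weeks start on Sunday instead of the default Monday
week_sunday <- cut(dates, "week", start.on.monday = FALSE)

levels(week_sunday)
# "2016-04-03" "2016-04-10"

# Number of days falling into each week
table(week_sunday)
```

The Sunday 2016-04-10 now opens the second week instead of closing the first one.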
Note that cut() returns factors, which is convenient for aggregation as requested by the OP:
str(result)
#'data.frame': 11 obs. of 3 variables:
# $ dates : chr "2016-04-04" "2016-04-05" "2016-04-06" "2016-04-07" ...
# $ cut_Date : Factor w/ 2 levels "2016-04-04","2016-04-11": 1 1 1 1 1 1 1 2 2 2 ...
# $ cut_POSIXt: Factor w/ 2 levels "2016-04-04","2016-04-11": 1 1 1 1 1 1 1 2 2 2 ...
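To illustrate the aggregation step, here is a small sketch using base R's aggregate(). The value column is made up for the example (the OP's data is not shown here), and the names df and weekly are just placeholders.

```r
# Same eleven days as above, plus a hypothetical value column
dates <- as.Date("2016-04-04") + 0:10
df <- data.frame(
  week  = cut(dates, "week"),   # factor with one level per week
  value = 1:11                  # made-up data, for illustration only
)

# Sum the values within each week; the factor acts as the grouping variable
weekly <- aggregate(value ~ week, data = df, FUN = sum)
weekly
```

Any other summary function (mean, length, ...) can be passed as FUN in the same way.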
However, for plotting aggregated values with ggplot2 (especially if there is a large number of weeks which might clutter the axis), it might be better to switch from a discrete scale to a continuous time scale. This requires coercing the factors back to Date or POSIXct:
as.Date(as.character(result$cut_Date))
as.POSIXct(as.character(result$cut_POSIXt))
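Put together, a plot on a continuous date axis might look roughly like the sketch below. The value column is again made up, the choice of geom_col() and of the axis label format is only one possibility, and the plotting part is skipped if ggplot2 is not installed.

```r
# Hypothetical data: eleven days with made-up values
dates <- as.Date("2016-04-04") + 0:10
df <- data.frame(week = cut(dates, "week"), value = 1:11)

# Aggregate by week, then coerce the factor back to Date
weekly <- aggregate(value ~ week, data = df, FUN = sum)
weekly$week <- as.Date(as.character(weekly$week))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # scale_x_date() gives a continuous time axis with one tick per week
  p <- ggplot(weekly, aes(x = week, y = value)) +
    geom_col() +
    scale_x_date(date_breaks = "1 week", date_labels = "%d %b")
  print(p)
}
```

With many weeks, date_breaks can be widened (for instance to "1 month") to keep the axis readable.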