
I have a dataframe where one of the columns contains dates (some dates appear multiple times). I want to aggregate the dates by week. The best way I can think of this is to round down the dates to the nearest Monday. How can I round down dates? How can I transform this list of dates into weeks?

2016-04-04
2016-04-05
2016-04-06
2016-04-07
2016-04-08
2016-04-09
2016-04-10
2016-04-11
2016-04-12
2016-04-13
2016-04-14

Expected output should be this:

2016-04-04
2016-04-04
2016-04-04
2016-04-04
2016-04-04
2016-04-04
2016-04-04
2016-04-11
2016-04-11
2016-04-11
2016-04-11
Uwe
Jaol
  • Possible duplicate of [R: How to judge Date in the same week?](http://stackoverflow.com/questions/43775261/r-how-to-judge-date-in-the-same-week) – Uwe May 05 '17 at 20:28
  • Seems like [this](http://stackoverflow.com/questions/26160117/changing-lubridate-function-to-start-on-monday-rather-than-sunday) could help. – Nick Criswell May 05 '17 at 20:28
  • You could just subtract the `wday` from your date. `lubridate` and `data.table` have implementations of this function. – MichaelChirico May 05 '17 at 20:29
  • `cut.Date()` starts weeks on Mondays by default. `lubridate` and `data.table` start weeks on Sundays. – Uwe May 05 '17 at 20:33
  • @uwe-block Thanks, that works perfectly. I just tried `cut.POSIXt(table$date, breaks = "week")` and it works. (My dates are stored as POSIXct.) – Jaol May 05 '17 at 20:43

3 Answers


Since lubridate version 1.7.0, the floor_date function has a week_start parameter that lets you specify the day on which the week begins (1 means Monday). This allows you to perform:

library(lubridate)
dates <- seq.Date(as.Date("2016-04-04"), as.Date("2016-04-14"), by = 1)
floor_date(dates, "weeks", week_start = 1)
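
Since the goal in the question is aggregation, the rounded dates can serve as a grouping key. The following is a minimal sketch of that idea, not part of the original answer: the data frame df, its value column, and the choice of sum() are assumptions made up for illustration.

```r
library(lubridate)

# Hypothetical data: one row per observation, some dates repeated
df <- data.frame(
  date  = as.Date(c("2016-04-04", "2016-04-05", "2016-04-05",
                    "2016-04-10", "2016-04-11", "2016-04-14")),
  value = c(1, 2, 3, 4, 5, 6)
)

# Round each date down to the Monday of its week
df$week <- floor_date(df$date, "weeks", week_start = 1)

# Sum the (assumed) value column within each week
weekly <- aggregate(value ~ week, data = df, FUN = sum)
weekly
```

Here the first four rows fall into the week starting 2016-04-04 and the last two into the week starting 2016-04-11.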

I would post it as a comment to Sraffa's response but I don't have the reputation.

Patrick Glettig

cut() from base R has two methods for objects of class Date and POSIXt which assume that weeks start on Monday by default (but may be changed to Sunday using start.on.monday = FALSE).

dates <- c("2016-04-04", "2016-04-05", "2016-04-06", "2016-04-07", "2016-04-08", 
           "2016-04-09", "2016-04-10", "2016-04-11", "2016-04-12", "2016-04-13", 
           "2016-04-14")
result <- data.frame(
  dates,
  cut_Date = cut(as.Date(dates), "week"),
  cut_POSIXt = cut(as.POSIXct(dates), "week"),
  stringsAsFactors = FALSE)

result
#        dates   cut_Date cut_POSIXt
#1  2016-04-04 2016-04-04 2016-04-04
#2  2016-04-05 2016-04-04 2016-04-04
#3  2016-04-06 2016-04-04 2016-04-04
#4  2016-04-07 2016-04-04 2016-04-04
#5  2016-04-08 2016-04-04 2016-04-04
#6  2016-04-09 2016-04-04 2016-04-04
#7  2016-04-10 2016-04-04 2016-04-04
#8  2016-04-11 2016-04-11 2016-04-11
#9  2016-04-12 2016-04-11 2016-04-11
#10 2016-04-13 2016-04-11 2016-04-11
#11 2016-04-14 2016-04-11 2016-04-11

Note that cut() returns factors, which is convenient for aggregation as requested by the OP:

str(result)
#'data.frame':  11 obs. of  3 variables:
# $ dates     : chr  "2016-04-04" "2016-04-05" "2016-04-06" "2016-04-07" ...
# $ cut_Date  : Factor w/ 2 levels "2016-04-04","2016-04-11": 1 1 1 1 1 1 1 2 2 2 ...
# $ cut_POSIXt: Factor w/ 2 levels "2016-04-04","2016-04-11": 1 1 1 1 1 1 1 2 2 2 ...
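
As a rough illustration of why the factor is handy, it can be passed directly to base R grouping functions such as table() and tapply(). This is only a sketch: the data frame obs and its value column are hypothetical and do not come from the question.

```r
# Hypothetical observations; the value column is made up for illustration
obs <- data.frame(
  date  = as.Date(c("2016-04-04", "2016-04-05", "2016-04-10",
                    "2016-04-11", "2016-04-14")),
  value = c(10, 20, 30, 40, 50)
)

# Week factor, labelled by the Monday that starts each week
obs$week <- cut(obs$date, "week")

# Number of observations per week
n_per_week <- table(obs$week)

# Sum of the values per week
sum_per_week <- tapply(obs$value, obs$week, sum)

n_per_week
sum_per_week
```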

However, for plotting aggregated values with ggplot2 (and if there is a large number of weeks which might clutter the axis) it might be better to switch from a discrete time scale to a continuous time scale. Then it is necessary to coerce factors back to Date or POSIXct:

# Convert the factor labels back to date classes for a continuous scale
as.Date(as.character(result$cut_Date))
as.POSIXct(as.character(result$cut_POSIXt))
Uwe
  • `cut(as.POSIXct(dates), "week")` may return Sunday instead of Monday; I think it is a timezone problem. – Ibo Feb 05 '19 at 20:04

With lubridate you could try this:

library(lubridate)
dates <- seq.Date(as.Date("2016-04-04"), as.Date("2016-04-14"), by = 1)
floor_date(dates - 1, "weeks") + 1

floor_date starts weeks on Sundays by default, so a Sunday would otherwise be grouped with the following week. To avoid that, subtract one day before rounding and then add one day back to the result.
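
The same weekday arithmetic can also be done without any package, along the lines of the "subtract the wday" suggestion in the comments. This is a sketch in base R rather than anything from the original answer; the variable names are illustrative, and it assumes the input is of class Date.

```r
dates <- seq.Date(as.Date("2016-04-04"), as.Date("2016-04-14"), by = 1)

# as.POSIXlt()$wday counts days since Sunday (0 = Sunday, 1 = Monday, ...)
wday <- as.POSIXlt(dates)$wday

# Shift so that Monday becomes 0 and Sunday becomes 6
days_since_monday <- (wday + 6) %% 7

# Subtracting that offset rounds each date down to its Monday
mondays <- dates - days_since_monday
mondays
```

For the dates in the question this gives 2016-04-04 for the first seven days and 2016-04-11 for the remaining four.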

Sraffa