2

I have a function that find the first day of the week for a given date. In this particular problem, weeks do start on Thursday.

The function works well for individual dates.

week_commencing <- function(date) {
  weekday <- lubridate::wday(date)
  if (weekday >= 5) { 
    return(date - lubridate::days(weekday) + lubridate::days(5))
  } else {
    return(date - lubridate::days(weekday) - lubridate::days(2))
  } 
}

Now, I would like to use it a pipe with dplyr. So I modified it to accept columns with Map.

week_commencing <- function(dates) {
  Map(function(date) {
    weekday <- lubridate::wday(date)
    if (weekday >= 5) { 
      return(date - lubridate::days(weekday) + lubridate::days(5))
    } else {
      return(date - lubridate::days(weekday) - lubridate::days(2))
    } 
  },dates)
}

I think the function is working, but is also applying some weird coercion to the dates because I end up with digit dates.

> test <- data.frame(datetime=seq.Date(as.Date("2016-06-01"),as.Date("2016-06-10"), by='day'))
> test
     datetime
1  2016-06-01
2  2016-06-02
3  2016-06-03
4  2016-06-04
5  2016-06-05
6  2016-06-06
7  2016-06-07
8  2016-06-08
9  2016-06-09
10 2016-06-10

> test %>% mutate(datetime=week_commencing(datetime))
   datetime
1     16947
2     16954
3     16954
4     16954
5     16954
6     16954
7     16954
8     16954
9     16961
10    16961

Any ideas on how to end up with normal date object? Is Map always applying coercion?

xav
  • 4,101
  • 5
  • 26
  • 32

2 Answers2

3

I don’t know why the class attribute is dropped here (the same happens when using other *apply functions). — The issue, deep down, seems to be that unlist drops classes:

> unlist(list(structure(1, class = 'foo')))
[1] 1

But the fix is straightforward enough: set the class at the end.

Furthermore, I’d suggest not using Map (which returns a list) but rather vapply. Then we’re left with:

week_commencing <- function(dates) {
  wc <- function(date) {
    weekday <- lubridate::wday(date)
    if (weekday >= 5) { 
      return(date - lubridate::days(weekday) + lubridate::days(5))
    } else {
      return(date - lubridate::days(weekday) - lubridate::days(2))
    } 
  }

  structure(vapply(dates, wc, numeric(1)), class = 'Date')
}

You could also use Vectorize on the function, but that also removes the class attribute.

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
  • Thanks a lot! This works like a charm! I have never seen such syntax `class<-`(16947,'Date')... Does it have a name ? Especially the `class<-` part so that I can read about it? – xav Jul 18 '16 at 14:00
  • 2
    As an alternative to `\`class<-\`(...)` you can also use `structure(vapply(dates, wc, numeric(1)), class = "Date")`. – nrussell Jul 18 '16 at 14:04
  • @xav It simply calls the [replacement function](http://stackoverflow.com/q/11563154/1968) of `class`. I’m using this syntax to avoid creating an otherwise useless variable to hold the value (the one thing I’d like to change in the explanation of the linked answer is that you should *not* use strings for function names, it’s nonsensical; [use backtick quoting instead](http://stackoverflow.com/a/36229703/1968), as in my answer). However, I always forget that I can just use `structure` to do the same here (even though I’ve used it in my own answer), the way nrussell said. – Konrad Rudolph Jul 18 '16 at 14:16
2

Or, you could keep it in the dplyr family:

week_commencing <- function(date) {
  weekday <- lubridate::wday(date)
  dplyr::if_else(weekday >= 5,
                 date - lubridate::days(weekday) + lubridate::days(5),
                 date - lubridate::days(weekday) - lubridate::days(2))
}
hrbrmstr
  • 77,368
  • 11
  • 139
  • 205
  • Although I prefer the flexibility of Konrad's answer, I actually need to use yours at the moment, because it runs in just a few seconds on my 900k rows dataset, whereas the other solution is still processing after 1h. Thanks a lot! – xav Jul 18 '16 at 15:53