6

I have a data frame with Date in the "%d-%m-%Y" format and have the week number. The dates are weekdays and I want the the Saturday for that week in another column.

I had initially check whether the date is a weekday or weekend using function in the Chron package but that was a Boolean validation. I had formatted the date variable to Date format and extracted the week number for each date.

df = data.frame(date=c("2014-08-20", "2014-08-25", "2014-10-08")) 
df$date=as.Date(df$date,format="%Y-%m-%d")
df$week=week(ymd(df$date))

The expected result should be:

date        week    EOW  
2014-08-20   34   2014-08-23

2014-08-25   34   2014-08-30

2014-10-08   41   2014-10-11
Ami
  • 197
  • 1
  • 12

4 Answers4

7

Base R option. First create a list of all days, then match it with weekdays and subtract it from 6 (as we want Saturday) to get how many days we need to add in the original date column.

all_days <- c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday")

#As @nicola mentioned this is locale dependent
#If your locale is not English you need weekdays in your current locale
#which you can manually write as shown above or do any one of the following

#all_days <- weekdays(seq(as.Date("2019-01-14"),by="day",length.out=7))
#OR
#all_days <- format(seq(as.Date("2019-01-14"),by="day",length.out=7), "%A")

df$EOW <- df$date + 6 - match(weekdays(df$date), all_days)

df
#        date week        EOW
#1 2014-08-20   34 2014-08-23
#2 2014-08-25   34 2014-08-30
#3 2014-10-08   41 2014-10-11

Or lubridate has a function ceiling_date which when used with unit = "week" would return you the next "Sunday" so we subtract 1 day from it to get "Saturday" instead.

library(lubridate)
df$EOW <- ceiling_date(df$date, unit = "week") - 1
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
1

Another way using

library(data.table)
df <- data.table(date=c("2014-08-20", "2014-08-25", "2014-10-08")) 
df$date=as.Date(df$date,format="%Y-%m-%d")
df$week=week(ymd(df$date))

## if the locale is not English, please use the local values for days 
days <- data.frame(DOW = c("Monday", "Tuesday", "Wednesday", "Thursday","Friday", "Saturday", "Sunday"))
days$day <- seq(1,7,1)

df <- df[,DOW:= weekdays(date)]
df <- merge(df, days, all.x = T, by = "DOW")

df <- df[, EOW := date + (6 - day)]
df

         DOW       date week day        EOW
1:    Monday 2014-08-25   34   1 2014-08-30
2: Wednesday 2014-08-20   34   3 2014-08-23
3: Wednesday 2014-10-08   41   3 2014-10-11
Hardik Gupta
  • 4,700
  • 9
  • 41
  • 83
0

Another Approach. This is for Saturday. If you want Thursday of that week just use '5' instead of '7' in the calculation, etc..

library(lubridate)
library(dplyr)

df <- data.table(date=as.Date(c("2019-01-04","2019-01-07", "2019-01-15", "2019-01-26","2019-01-27")))

df %>% mutate(cw = wday(date)) %>%
  mutate(nSaturday = date+(7-cw))

        date cw  nSaturday
1 2019-01-04  6 2019-01-05
2 2019-01-07  2 2019-01-12
3 2019-01-15  3 2019-01-19
4 2019-01-26  7 2019-01-26
5 2019-01-27  1 2019-02-02
  • 1
    Optimized in what sense? – s_baldur Jan 30 '19 at 10:20
  • @snoram, Without initiating a weekdays array again, we can make use of `wday()` function in a single line code. – Sai Prabhanjan Reddy Feb 01 '19 at 05:10
  • 1
    Why is this optimised? If anything this has more overhead as it depends on more external R libraries. Why use a `data.table` but then perform `tidyverse` operations? – Maurits Evers Feb 28 '19 at 07:48
  • Optimized doesn't mean it should not depend on more external libraries! It literally means less number of lines of code and speed of execution!! @Maurits, hope you understood the point here – Sai Prabhanjan Reddy Feb 28 '19 at 07:53
  • 1
    Condescension will not get you very far around here @SaiPrabhanjanReddy; optimised code has absolutely *nothing* to do with the number of lines of code (unless you're code golfing, in which case you're on the wrong forum). If you have such sweeping statements such as "optimised code", back it up! Run a benchmark analysis! I can *guarantee* that your solution is not as performant as some of the other solutions presented here (in particular when compared to the full `data.table` based solutions). Which makes your statement plain wrong. – Maurits Evers Feb 28 '19 at 08:16
0

Playing around with data.table join syntax:

library(data.table)

# Create a saturdays dataset
saturdays2014 <- data.table(date = seq(as.Date("2014-01-01"), as.Date("2014-12-31"), by = 1))
Sys.setlocale("LC_ALL","English")
saturdays2014 <- saturdays2014[weekdays(date) == "Saturday"]

# convert df to data.table and date to a Date variable
setDT(df)[, date := as.Date(date)]

# Join
df[saturdays2014, on = "date", roll = 6, EOW := i.date]
df    
#          date        EOW
# 1: 2014-08-20 2014-08-23
# 2: 2014-08-25 2014-08-30
# 3: 2014-10-08 2014-10-11
s_baldur
  • 29,441
  • 4
  • 36
  • 69
  • Also local depdendent, but should be enough for filtering to know one date that is a saturday and from there use the fact they occur every 7th day. – s_baldur Jan 30 '19 at 10:19