0

Hello I am trying to find the week number for a series of date over three years. However R is not giving the correct week number. I am generating a seq of dates from 2016-04-01 to 2019-03-30 and then I am trying to calculate week over three years such that I get the week number 54, 55 , 56 and so on.

However when I check the week 2016-04-03 R shows the week number as 14 where as when cross checked with excel it is the week number 15 and also it simply calculates 7 days and does not reference the actual calendar days. Also the week number starts from 1 for every start of year

The code looks like this

days <- seq(as.Date("2016-04-03"),as.Date("2019-03-30"),'days')
weekdays <- data.frame('days'=days, Month = month(days), week = week(days),nweek = rep(1,length(days)))

This is how the results looks like

   days    week
2016-04-01  14
2016-04-02  14
2016-04-03  14
2016-04-04  14
2016-04-05  14
2016-04-06  14
2016-04-07  14
2016-04-08  15
2016-04-09  15
2016-04-10  15
2016-04-11  15
2016-04-12  15

However when checked from excel this is what I get

   days    week
2016-04-01  14
2016-04-02  14
2016-04-03  15
2016-04-04  15
2016-04-05  15
2016-04-06  15
2016-04-07  15
2016-04-08  15
2016-04-09  15
2016-04-10  16
2016-04-11  16
2016-04-12  16

Can someone please help me identify wherever I am going wrong.

Thanks a lot in advance!!

Analytics_TM
  • 493
  • 6
  • 28
  • 1
    Your example data starts at `2016-04-01`, but your code starts from `2016-04-03` ? – neilfws May 09 '19 at 00:12
  • The different week conventions are nicely explained here: [Transform year/week to date object](https://stackoverflow.com/a/45587644/1851712) – Henrik May 09 '19 at 07:37

1 Answers1

8

Not anything that you're doing wrong per se, there is just a difference in how R (I presume you're using the lubridate package) and Excel calculate week numbers.

  • R will calculate week numbers based on the seven day block from 1 January that year; but
  • Excel calculates week numbers based on a week starting from Sunday.

Taking the first few days of January 2016 for an example. On, Friday, 1 January 2016, both R and Excel will say this is week 1.

On Sunday, 3 January 2016:

  • this is within the first seven days of the start of the year so R will return week number 1; but
  • it is a Sunday, so Excel ticks over to week number 2.

Try this:

ifelse(test = weekdays.Date(days[1]) == "Sunday", yes = epiweek(days[1]), no = epiweek(days[1]) + 1) + cumsum(weekdays.Date(days) == "Sunday")

This tests whether the first day is a Sunday or not and returns an appropriate week number starting point, then adds on one more week number each Sunday. Gives the same week number if there's overlap between years.

Spherical
  • 396
  • 1
  • 4
  • 7
    See also `?week`, which describes `isoweek` and `epiweek`. You can get the same numbers as from Excel by using `epiweek(days) + 1`. – neilfws May 09 '19 at 00:17
  • 5
    ...or use the built-in R tools and do `as.numeric(format(days, "%U"))+1` which is all described in `?strptime` where the `"%U"` format is the US convention starting at 0 instead of 1 (which Excel seems to use) and `"%V"` is the ISO 8601 standard. – thelatemail May 09 '19 at 00:21
  • @neilfws - I don't think they're exactly equivalent. There would still be a difference if the first week crosses into a new year. – Ritchie Sacramento May 09 '19 at 00:25
  • @H1 You may be right. By "same numbers" I meant as in the example data in the question. – neilfws May 09 '19 at 00:26
  • So, I have tried both of this `as.numeric(format(days, "%V"))+1 and epiweek(days) + 1` and it works as far as one year day range is taken in to account. It just depends if you want to count Sunday as the start of the week or not, however since I have three years of data it is converting the next week as 2 where as I am looking for something that can give 54th week and so on after it completes the 53rd week. – Analytics_TM May 09 '19 at 01:44
  • @Spherical, That's goog info, but since I am working on marketing data it becomes imperative to calculate how excel calculates week number. I am trying to find a solution for that. The above suggestions does help for a year range but as the change it ticks it over again to start the week number again and I was wondering if some how I can get the week number continued and not start again. – Analytics_TM May 09 '19 at 02:00
  • @AnalyticsTeam Can you just clarify what you're looking for? My understanding is that you're looking for the result Excel would give you, but that doesn't reset back to `week = 1` at the start of new year? – Spherical May 09 '19 at 03:12
  • @Spherical, Sure. All I want is to calculate the week number for a range of dates that takes into fact the actual calendar dates and the week starts on Sunday, does not reset to 1 when the year changes and if there is a week overlap between years can take care of that and gives the same week number as of the last year last week number. I hope that makes things clear – Analytics_TM May 09 '19 at 03:29
  • 1
    Try this: `ifelse(test = weekdays.Date(days[1]) == "Sunday", yes = epiweek(days[1]), no = epiweek(days[1]) + 1) + cumsum(weekdays.Date(days) == "Sunday")`. This tests whether the first day is a Sunday or not and returns an appropriate week number starting point, then adds on one more week number each Sunday. Gives the same week number if there's overlap between years. – Spherical May 09 '19 at 04:02
  • @Spherical, This is brilliant. This is exactly what I was looking for and works like magic. Thanks a lot for putting in the effort and time – Analytics_TM May 09 '19 at 04:33
  • @AnalyticsTeam Not a problem - I've edited my answer above. Can you accept if you're happy? – Spherical May 09 '19 at 07:17