
I have a tibble with a column of dates that I would like to round to the nearest quarter. With lubridate::round_date(), every date seems to round down to the start of its quarter. I want dates before the midpoint of the quarter to round down and dates after it to round up.

library(tidyverse)
library(lubridate)

my_tibble <- tibble(my_dates = seq(ymd('2022-01-01'), ymd('2022-03-31'), by = 'days'))

my_tibble <- 
  my_tibble %>%
  mutate(qtr_date = round_date(my_dates, unit = "quarter"))

The early dates should round down, which they do:

> head(my_tibble)
# A tibble: 6 × 2
  my_dates   qtr_date           
  <date>     <dttm>             
1 2022-01-01 2022-01-01 00:00:00
2 2022-01-02 2022-01-01 00:00:00
3 2022-01-03 2022-01-01 00:00:00
4 2022-01-04 2022-01-01 00:00:00
5 2022-01-05 2022-01-01 00:00:00
6 2022-01-06 2022-01-01 00:00:00

But the later dates also round down:

> tail(my_tibble)
# A tibble: 6 × 2
  my_dates   qtr_date           
  <date>     <dttm>             
1 2022-03-26 2022-01-01 00:00:00
2 2022-03-27 2022-01-01 00:00:00
3 2022-03-28 2022-01-01 00:00:00
4 2022-03-29 2022-01-01 00:00:00
5 2022-03-30 2022-01-01 00:00:00
6 2022-03-31 2022-01-01 00:00:00

I was expecting the dates after the midpoint (2022-02-15) to round up to the start of the second quarter (2022-04-01).

If I wanted the dates in the quarter to always round up or always round down, I would have used ceiling_date() or floor_date().

Is there some way to modify round_date() so that it actually rounds to the nearest quarter, up or down?


Here is my output from sessionInfo():

> sessionInfo()
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.8.0 forcats_0.5.1   stringr_1.4.0   dplyr_1.0.9     purrr_0.3.4    
 [6] readr_2.1.2     tidyr_1.2.0     tibble_3.1.6    ggplot2_3.3.6   tidyverse_1.3.1

loaded via a namespace (and not attached):
 [1] cellranger_1.1.0 pillar_1.7.0     compiler_4.2.0   dbplyr_2.1.1     tools_4.2.0     
 [6] jsonlite_1.8.0   lifecycle_1.0.1  gtable_0.3.0     pkgconfig_2.0.3  rlang_1.0.3     
[11] reprex_2.0.1     DBI_1.1.2        cli_3.3.0        rstudioapi_0.13  haven_2.5.0     
[16] xml2_1.3.3       withr_2.5.0      httr_1.4.3       fs_1.5.2         generics_0.1.2  
[21] vctrs_0.4.1      hms_1.1.1        grid_4.2.0       tidyselect_1.1.2 glue_1.6.2      
[26] R6_2.5.1         fansi_1.0.3      readxl_1.4.0     tzdb_0.3.0       modelr_0.1.8    
[31] magrittr_2.0.3   backports_1.4.1  scales_1.2.0     ellipsis_0.3.2   rvest_1.0.2     
[36] assertthat_0.2.1 colorspace_2.0-3 utf8_1.2.2       stringi_1.7.6    munsell_0.5.0   
[41] broom_0.8.0      crayon_1.5.1 
SamR
Chris Kiniry
  • I cannot reproduce your problem. On my system, using your sample data and code, everything from 2022-02-15 onward gets rounded up to 2022-04-01. – Wimpel Dec 26 '22 at 14:45
  • Dang, I hate when stuff like that happens. Weird. – Chris Kiniry Dec 26 '22 at 14:50
  • Try in a new R session, with only the needed packages loaded. – Wimpel Dec 26 '22 at 14:51
  • It's generally a good idea to use a new name for the modified tibble, in case you accidentally run the code twice. I don't see how that would be the issue here, but it can cause confusion in some cases. BTW, I see the same results as @Wimpel. – user2554330 Dec 26 '22 at 14:54
  • I turned my computer back on, started a new session of R, and loaded only tidyverse and lubridate. I am still getting all dates rounded down to '2022-01-01'. – Chris Kiniry Dec 26 '22 at 14:54
  • Could you add the output of `sessionInfo()` to your post? – user2554330 Dec 26 '22 at 14:55
  • You are not using the latest versions of R or the tidyverse/lubridate packages. But I fail to see how that could cause the problem here... weird. – Wimpel Dec 26 '22 at 14:59
  • There seems to be an issue in `round_date()` with older versions. Try updating your lubridate package to the latest version; see: https://github.com/tidyverse/lubridate/issues/1073 – Wimpel Dec 26 '22 at 15:03
  • Yes, I installed the newest lubridate version and this fixed my problem! Thank you! – Chris Kiniry Dec 26 '22 at 15:06

1 Answer


I could replicate this issue in lubridate v1.8.0. If you look at the source for round_date() you will see that this function has been completely refactored. The work is now done by the line:

timechange::time_round(x, unit = unit, week_start = as_week_start(week_start))

round_date() was previously calling floor_date() and ceiling_date() and finding which was nearest. We can see that this change was made in the commit on November 4th 2022 (line 174).

This does not entirely explain why your code did not work, but knowing round_date() is now calculated differently, I updated lubridate to the latest CRAN version (1.9.0) with:

install.packages("lubridate")
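To confirm whether an update is needed, the installed version can be compared against 1.9.0. This is only a sketch: `is_outdated()` is a hypothetical helper, and treating 1.9.0 as the cutoff is an assumption based on that version fixing the problem here, not on a documented changelog entry.

```r
# Hypothetical helper (not part of lubridate): TRUE when `installed` is older
# than the version assumed to contain the fix (1.9.0, per this answer).
is_outdated <- function(installed, fixed = "1.9.0") {
  as.package_version(installed) < as.package_version(fixed)
}

is_outdated("1.8.0")  # TRUE: the version from the question
is_outdated("1.9.0")  # FALSE

# Check the version that is actually installed (skipped if lubridate is absent)
if (requireNamespace("lubridate", quietly = TRUE)) {
  print(is_outdated(utils::packageVersion("lubridate")))
}
```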

It is also possible to install a specific version of a package as described here.

Updating to 1.9.0 also installed timechange (v0.1.1) as a dependency, and with both in place round_date() rounded the later dates up as expected.
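If updating is not an option, the floor/ceiling comparison that round_date() used to perform can be approximated in base R. The following is a minimal sketch, not lubridate's implementation: `round_to_quarter()` is a hypothetical helper, it assumes Date input with no missing values, and it sends exact ties up, which matches the 2022-02-15 result reported in the comments.

```r
# Hypothetical helper (not part of lubridate): round Dates to the nearest
# quarter start by comparing the distance to the quarter's floor and ceiling.
# Assumes no NA values in `x`.
round_to_quarter <- function(x) {
  x <- as.Date(x)
  lt <- as.POSIXlt(x)
  year <- lt$year + 1900
  # First month (1, 4, 7 or 10) of the quarter containing each date
  q_month <- (lt$mon %/% 3) * 3 + 1
  lower <- as.Date(sprintf("%04d-%02d-01", year, q_month))
  # Start of the following quarter, rolling into the next year after Q4
  next_month <- q_month + 3
  next_year <- year + (next_month > 12)
  next_month <- ifelse(next_month > 12, next_month - 12, next_month)
  upper <- as.Date(sprintf("%04d-%02d-01", next_year, next_month))
  # Choose the closer boundary; exact ties go up
  round_up <- which(as.numeric(upper - x) <= as.numeric(x - lower))
  lower[round_up] <- upper[round_up]
  lower
}

my_dates <- seq(as.Date("2022-01-01"), as.Date("2022-03-31"), by = "days")
head(round_to_quarter(my_dates))
tail(round_to_quarter(my_dates))
```

Because it returns a Date vector, the helper can be dropped into the mutate() call from the question in place of round_date().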

SamR